Bible Taxonomy – Parsing Input

Note: This post is part of a series explaining how I created the new Bible Taxonomy tool as seen on DiscipleShare. To see it in action, or to find great, free curriculum to use in churches, visit: http://www.discipleshare.net/

Here’s a confession: this part of the plugin is only half-baked. It’s a work in progress.

Users can select books, chapters, and verses using the JavaScript tool. But what about quick entries? Surely a good user interface favors keystrokes and minimizes mouse clicks, right?

Well, I tried. The issue is how many permutations of entry types I’d have to deal with, let alone the abbreviations and common names of books that I’d need to check for.

For instance, how many ways can you cite a scripture reference?

  • B
  • B C
  • B C:V
  • B C-C
  • B C:V-V
  • B C:V, V
  • B C:V – C:V
  • B C:V – B C:V
    • B = Book, C = Chapter, V = Verse. And this obviously doesn’t account for other styles of notation (such as using a period (.) instead of a colon (:) ).

      It also doesn’t account for complex references, such as B C-C:V, V-V, or the similarly complicated ones that the lectionary often uses.

      Note: in the steps that follow, I wrapped it all in a “Service” class, since the code would be reused in several places.

      Step 1: Figure out if what users enter is legit
      This wouldn’t be as big of an issue if they just used the JavaScript widget. But the first rule of programming applies: if data comes from a user, don’t trust it. So we do some checks:

      function SanitizeVerses($bibleverseIn) {
      		// Encode HTML special characters (including quotes) in $bibleverseIn
      		$bibleverse = htmlspecialchars($bibleverseIn, ENT_QUOTES, 'UTF-8');
      		
      		// Normalize "." notation to ":" (e.g. 3.16 becomes 3:16)
      		$bibleverse = str_replace(".", ":", $bibleverse);
      		
      		// Remove spaces so input is consistent, including spaces passed in as %20
      		$bibleverse = str_replace(" ", "", $bibleverse);
      		
      		return $bibleverse;
      		
      	}
      

      This encodes HTML special characters (quotes included), converts periods to colons, and most importantly, strips out spaces!
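To see what those three steps do to typical input, here is a minimal standalone sketch. The function name `sanitizeVersesSketch` is made up for illustration; the body simply mirrors the method above outside of the class.

```php
<?php
// Standalone copy of the sanitizing steps; the name sanitizeVersesSketch is hypothetical.
function sanitizeVersesSketch(string $input): string {
    // Encode HTML special characters, including both quote styles
    $verse = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');
    // Treat "3.16" the same as "3:16"
    $verse = str_replace('.', ':', $verse);
    // Drop spaces so "John 3:16" and "John3:16" end up identical
    return str_replace(' ', '', $verse);
}

echo sanitizeVersesSketch('John 3.16'), "\n";     // John3:16
echo sanitizeVersesSketch('1 Samuel 17:4'), "\n"; // 1Samuel17:4
```

Note that the later steps all rely on this space-free form, which is why multi-word book names need special handling further down.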

      Step 2: Figure out if it’s a ‘numbered book’
      This was actually step 5, but that’s because I didn’t plan properly. Oops.

      function GetNumberedBookStatus($string) {
      		$firstcar = substr($string, 0, 1);
      		if (is_numeric($firstcar)) {
      			return 1;
      		}
      		else {
      			return 0;
      		}
      	}
      

      Step 3: Determine the chapter number location
      If there are any trailing numbers after the book name, we want to know their location in the verses string so we can parse more quickly.

      function GetChapterNumLoc($bookchapter) {
      		
      		if ($this->GetNumberedBookStatus($bookchapter) == 1) {
      			$string = substr($bookchapter, 1);
      		} else {
      			$string = $bookchapter;
      		}
      	
      		$nums = array();
      		
      		if (strpos($string, "1") > 0) {
      			$nums[] = strpos($string, "1");
      		}
      		if (strpos($string, "2") > 0) {
      			$nums[] = strpos($string, "2");
      		}
      		if (strpos($string, "3") > 0) {
      			$nums[] = strpos($string, "3");
      		}
      		if (strpos($string, "4") > 0) {
      			$nums[] = strpos($string, "4");
      		}
      		if (strpos($string, "5") > 0) {
      			$nums[] = strpos($string, "5");
      		}
      		if (strpos($string, "6") > 0) {
      			$nums[] = strpos($string, "6");
      		}
      		if (strpos($string, "7") > 0) {
      			$nums[] = strpos($string, "7");
      		}
      		if (strpos($string, "8") > 0) {
      			$nums[] = strpos($string, "8");
      		}
      		if (strpos($string, "9") > 0) {
      			$nums[] = strpos($string, "9");
      		}
      		if (strpos($string, "0") > 0) {
      			$nums[] = strpos($string, "0");
      		}
      		
      		if (count($nums) == 0) {
      			$chapternumloc = 0;
      		} else {
      			if ($this->GetNumberedBookStatus($bookchapter) == 1) {
      				$chapternumloc = min($nums)+1; // to return an accurate number for scrollnum
      			} else {
      				$chapternumloc = min($nums);
      			}
      		}
      		return $chapternumloc;
      	}
      

      I know there has to be a lightweight version that could do the same, but I didn’t want to take the time researching … it works.
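For what it's worth, one lighter-weight possibility is PHP's built-in `strcspn`, which returns the length of the leading run of characters that are not in a given set. The sketch below is a hypothetical standalone rewrite (the name `chapterNumLocSketch` is invented here), not the plugin's code. It should agree with the method above for inputs like the ones in this post, but it returns 0 when a digit immediately follows the optional leading book number, an edge case the original handles differently.

```php
<?php
// Hypothetical lighter-weight sketch; not the plugin's actual method.
function chapterNumLocSketch(string $bookchapter): int {
    // Skip the leading digit of numbered books such as "1Samuel17"
    $offset = is_numeric(substr($bookchapter, 0, 1)) ? 1 : 0;
    $rest = substr($bookchapter, $offset);
    // Length of the initial run with no digits = position of the first digit
    $pos = strcspn($rest, '0123456789');
    if ($pos === 0 || $pos === strlen($rest)) {
        return 0; // no chapter number found after the book name
    }
    return $pos + $offset;
}

echo chapterNumLocSketch('John3'), "\n";     // 4
echo chapterNumLocSketch('1Samuel17'), "\n"; // 7
echo chapterNumLocSketch('Genesis'), "\n";   // 0
```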

      Step 4: Determine the format of the input syntax

      function GetVerseFormat($bibleverse) {
      		$splits = preg_split("/[\,\:\-]/", $bibleverse);
      		
      		$firststring = $splits[0];
      		if ($this->GetNumberedBookStatus($firststring) == 1) {
      			$firststring = substr($firststring, 1);
      		}
      		
      		$nums = array();
      		if (strpos($firststring, "1") > 0) {
      			$nums[] = strpos($firststring, "1");
      		}
      		if (strpos($firststring, "2") > 0) {
      			$nums[] = strpos($firststring, "2");
      		}
      		if (strpos($firststring, "3") > 0) {
      			$nums[] = strpos($firststring, "3");
      		}
      		if (strpos($firststring, "4") > 0) {
      			$nums[] = strpos($firststring, "4");
      		}
      		if (strpos($firststring, "5") > 0) {
      			$nums[] = strpos($firststring, "5");
      		}
      		if (strpos($firststring, "6") > 0) {
      			$nums[] = strpos($firststring, "6");
      		}
      		if (strpos($firststring, "7") > 0) {
      			$nums[] = strpos($firststring, "7");
      		}
      		if (strpos($firststring, "8") > 0) {
      			$nums[] = strpos($firststring, "8");
      		}
      		if (strpos($firststring, "9") > 0) {
      			$nums[] = strpos($firststring, "9");
      		}
      		if (strpos($firststring, "0") > 0) {
      			$nums[] = strpos($firststring, "0");
      		}
      		
      		if (count($nums) == 0) {
      			$chapternumloc = 0;
      		} else {
      			$chapternumloc = min($nums);
      		}
      		
      		$splitscount = count($splits);
      				
      		if ($chapternumloc < 2 || $chapternumloc == null) { // There is no chapter // verify $splits length = 1 before returning "B"
      			if ($splitscount == 1) {
      				return "B";
      			}
      			else {
      				return "Error: Unknown Format (Book should be followed by Chapter number)";
      			}
      		} else {
      			if ($splitscount == 1) {
      				return "B C";
      			} elseif ($splitscount == 2) {
      				if (strpos($bibleverse, ":") > 0) {
      					return "B C:V";
      				} elseif(strpos($bibleverse, "-") > 0) {
      					return "B C-C";
      				} else {
      					return "Error: Unknown Format (Are you missing a chapter?)";
      				}
      			} elseif ($splitscount == 3) {
      				if (substr($bibleverse, strpos($bibleverse,"-")+1, strlen($splits[2])) == $splits[2]) { // B C:V-V
      					return "B C:V-V";
      				} elseif (substr($bibleverse, strpos($bibleverse,",")+1, strlen($splits[2])) == $splits[2]) { // B C:V,V
      					return "B C:V,V";
      				} else {
      					return "Error: Unknown Format (Are you trying to reference more than one verse?  We suggest Book Chapter:Verse - Verse format)";
      				}
      			} elseif ($splitscount == 4) {
      				if (is_numeric($splits[2])) { // 3rd element is numeric, not a BC combo
      					return "B C:V - C:V";
      				} else {
      					return "B C:V - B C:V";
      				}
      			} else {
      				return "Error: Unknown Format";
      			}
      		}
      	}	
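The format detection above hinges on how many pieces the split produces and on whether the third piece is numeric. As a rough illustration, this sketch applies the same split pattern to a few sanitized references; the function name `splitReferenceSketch` is invented for the example.

```php
<?php
// Illustration only: the same split pattern used above, applied to sanitized input.
function splitReferenceSketch(string $sanitized): array {
    return preg_split('/[,:\-]/', $sanitized);
}

print_r(splitReferenceSketch('John3:16-18'));      // 3 parts: John3, 16, 18
print_r(splitReferenceSketch('Genesis1:1-2:3'));   // 4 parts; third part "2" is numeric
print_r(splitReferenceSketch('John3:16-Acts1:8')); // 4 parts; third part "Acts1" is not numeric
```

Three parts means a verse range or verse list within one chapter; four parts means a range across chapters, or across books when the third part still contains a book name.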
      

      Step 5: Set up the helper methods to query the XML using XPath with the input values

      function LookupBookId($xml, $book) {
      		try {
      			$results = $xml->xpath("//book[@name=\"".$book."\"]");
      			foreach($results as $resultnode) {
      				$atts = $resultnode->attributes();
      				return $atts["id"];
      			}
      			return "-1"; // none found
      		} catch (Exception $e) {
      			return "-1";
      		}
      	}
      	
      	function LookupChapterId($xml, $book, $chapter) {
      		try {
      			$results = $xml->xpath("//book[@name=\"".$book."\"]/chapter[@name=\"".$chapter."\"]");
      			foreach($results as $resultnode) {
      				$atts = $resultnode->attributes();
      				return $atts["id"];
      			}
      			return "-1"; // none found
      		} catch (Exception $e) {
      			return "-1";
      		}	
      	}
      	
      	function LookupVerseId($xml, $book, $chapter, $verse) {
      		try {
      			$results = $xml->xpath("//book[@name=\"".$book."\"]/chapter[@name=\"".$chapter."\"]/verse[@name=\"".$verse."\"]");
      			foreach($results as $resultnode) {
      				$atts = $resultnode->attributes();			
      				return $atts["id"];
      			}
      			return "-1"; // none found
      		} catch (Exception $e) {
      			return "-1";
      		}
      	}
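The real Bible_min_with_IDs.xml isn't shown in this post, so the sketch below assumes a layout inferred from the XPath expressions above: nested book, chapter, and verse elements, each carrying name and id attributes. The sample document, its id values, and the function name `lookupVerseIdSketch` are all invented for illustration.

```php
<?php
// Assumed XML layout and made-up ids; the real data file may differ.
$sample = <<<XML
<bible>
  <book name="John" id="43">
    <chapter name="3" id="1000">
      <verse name="16" id="26137"/>
    </chapter>
  </book>
</bible>
XML;

// Returns the id attribute as a string, or "-1" when nothing matches.
function lookupVerseIdSketch(SimpleXMLElement $xml, string $book, string $chapter, string $verse): string {
    $nodes = $xml->xpath("//book[@name=\"$book\"]/chapter[@name=\"$chapter\"]/verse[@name=\"$verse\"]");
    // xpath() yields an empty array when nothing matches
    return $nodes ? (string) $nodes[0]['id'] : '-1';
}

$xml = simplexml_load_string($sample);
echo lookupVerseIdSketch($xml, 'John', '3', '16'), "\n"; // 26137
echo lookupVerseIdSketch($xml, 'John', '3', '99'), "\n"; // -1
```

Casting the attribute to a string is a small design choice here: the helper methods above return the attribute object itself and rely on PHP converting it to a number when it is multiplied by 1.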
      

      Step 6: Set up the class holder variables

      class IDWebService {
      	var $xml;
      	var $id_firstBook;
      	var $id_firstChapter;
      	var $id_firstVerse;
      	var $id_secondBook;
      	var $id_secondChapter;
      	var $id_secondVerse;
      	var $exceptionBool;
      	var $exceptionMessage;
      	var $verses;
      	var $sanitizedVerses;
      	
      	function __construct($verses) { // PHP 8 no longer treats a method named after the class as the constructor
      		try {
      			$this->verses = $this->SanitizeVerses($verses);
      			$xmlurl = STYLESHEETPATH.'/bible-taxonomy/'.'Bible_min_with_IDs.xml';
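      			$xml = simplexml_load_file($xmlurl);
      			if ($xml === false) { // simplexml_load_file returns false on failure instead of throwing
      				throw new Exception('Could not load the Bible XML file');
      			}
      			$this->xml = $xml;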
      			$this->ProcessVerses($this->verses);
      			
      			$this->exceptionBool = false;
      		} catch (Exception $e) {
      			$this->id_firstBook = null;
      			$this->id_firstChapter = null;
      			$this->id_firstVerse = null;
      			$this->id_secondBook = null;
      			$this->id_secondChapter = null;
      			$this->id_secondVerse = null;
      			$this->sanitizedVerses = null;
      			$this->exceptionBool = true;
      			$this->exceptionMessage = $e->getMessage(); // store the message text rather than the exception object
      		}
      	}
      	
      	function __destruct() { // to avoid a memory leak
      		unset($this->xml);
      		unset($this->id_firstBook);
      		unset($this->id_firstChapter);
      		unset($this->id_firstVerse);
      		unset($this->id_secondBook);
      		unset($this->id_secondChapter);
      		unset($this->id_secondVerse);
      		unset($this->exceptionBool);
      		unset($this->exceptionMessage);
      		unset($this->verses);
      		unset($this->sanitizedVerses);
      	}
      

      Step 7: Parse the heck out of it

      function ProcessVerses($verses) {
      		$splits = preg_split("/[\,\:\-]/", $verses);
      		
      		$book = $splits[0];
      		if ($this->GetNumberedBookStatus($book) == 1) {
      			$scrollnum = substr($book, 0, 1);
      			$scrollname = substr($book,1);
      			$book = $scrollnum.' '.$scrollname;
      		}
      		
      		$verseformat = $this->GetVerseFormat($verses);
      		switch ($verseformat) {
      			case "B":
      				if ($book == 'SongofSongs') {
      					$book = 'Song of Songs'; // despacifying makes XML miss ID
      				} elseif($book == 'ActsoftheApostles') {
      					$book = 'Acts of the Apostles';
      				}
      				
      				$bookid = $this->LookupBookId($this->xml, $book);
      				if ($bookid*1 >= 0) {
      					$this->id_firstBook = $bookid;
      					$this->sanitizedVerses = $book;
      				} else {
      					$this->ProcessError('Book not found');
      				}
      				break;
      			case "B C":
      				$book = substr($book, 0, $this->GetChapterNumLoc($book));
      				
      				if ($book == 'SongofSongs') {
      					$book = 'Song of Songs'; // despacifying makes XML miss ID
      				} elseif($book == 'ActsoftheApostles') {
      					$book = 'Acts of the Apostles';
      				}
      				
      				$chapter = substr($splits[0], $this->GetChapterNumLoc($splits[0]));
      				$bookid = $this->LookupBookId($this->xml, $book);
      				if ($bookid*1 >= 0) {
      					$this->id_firstBook = $bookid;
      					$chapterid = $this->LookupChapterId($this->xml, $book, $chapter);
      					if ($chapterid*1 >= 0) {
      						$this->id_firstChapter = $chapterid;
      						$this->sanitizedVerses = $book.' '.$chapter;
      					} else {
      						$this->ProcessError('Chapter not found');
      					}
      				} else {
      					$this->ProcessError('Book not found');
      				}
      				break;
      			case "B C:V":
      				$book = substr($book, 0, $this->GetChapterNumLoc($book));
      				
      				if ($book == 'SongofSongs') {
      					$book = 'Song of Songs'; // despacifying makes XML miss ID
      				} elseif($book == 'ActsoftheApostles') {
      					$book = 'Acts of the Apostles';
      				}
      				
      				$chapter = substr($splits[0], $this->GetChapterNumLoc($splits[0]));
      				$verse = $splits[1];
      				$bookid = $this->LookupBookId($this->xml, $book);
      				if ($bookid*1 >= 0) {
      					$this->id_firstBook = $bookid;
      					$chapterid = $this->LookupChapterId($this->xml, $book, $chapter);
      						if ($chapterid*1 >= 0) {
      							$this->id_firstChapter = $chapterid;
      							$verseid = $this->LookupVerseId($this->xml, $book, $chapter, $verse);
      							if ($verseid*1 >= 0) {
      								$this->id_firstVerse = $verseid;
      								$this->sanitizedVerses = $book.' '.$chapter.':'.$verse;
      							} else {
      								$this->ProcessError('Verse not found');
      							}
      						} else {
      							$this->ProcessError('Chapter not found');
      						}
      				} else {
      					$this->ProcessError('Book not found');
      				}
      				break;
      			case "B C-C":
      				$book = substr($book, 0, $this->GetChapterNumLoc($book));
      				
      				if ($book == 'SongofSongs') {
      					$book = 'Song of Songs'; // despacifying makes XML miss ID
      				} elseif($book == 'ActsoftheApostles') {
      					$book = 'Acts of the Apostles';
      				}
      				
      				$chapter = substr($splits[0], $this->GetChapterNumLoc($splits[0]));
      				$secondchapter = $splits[1];
      				$bookid = $this->LookupBookId($this->xml, $book);
      				if ($bookid*1 >= 0) {
      					$this->id_firstBook = $bookid;
      					$chapterid = $this->LookupChapterId($this->xml, $book, $chapter);
      					$secondchapterid = $this->LookupChapterId($this->xml, $book, $secondchapter);
      					if ($chapterid*1 >= 0 && $secondchapterid*1 >= 0) {
      						$this->id_firstChapter = $chapterid;
      						$this->id_secondChapter = $secondchapterid;	
      						$this->sanitizedVerses = $book.' '.$chapter.'-'.$secondchapter;		
      					} else {
      						$this->ProcessError('Chapter(s) not found');
      					}
      				} else {
      					$this->ProcessError('Book not found');
      				}
      				break;
      			case "B C:V-V":
      			case "B C:V,V":
      				$book = substr($book, 0, $this->GetChapterNumLoc($book));
      				
      				if ($book == 'SongofSongs') {
      					$book = 'Song of Songs'; // despacifying makes XML miss ID
      				} elseif($book == 'ActsoftheApostles') {
      					$book = 'Acts of the Apostles';
      				}
      				
      				$chapter = substr($splits[0], $this->GetChapterNumLoc($splits[0]));
      				$verse = $splits[1];
      				$secondverse = $splits[2];
      				$bookid = $this->LookupBookId($this->xml, $book);
      			
      				if ($bookid*1 >= 0) {
      					$this->id_firstBook = $bookid;
      					$chapterid = $this->LookupChapterId($this->xml, $book, $chapter);
      					if ($chapterid*1 >= 0) {
      						$this->id_firstChapter = $chapterid;
      						$verseid = $this->LookupVerseId($this->xml, $book, $chapter, $verse);
      						$secondverseid = $this->LookupVerseId($this->xml, $book, $chapter, $secondverse);
      						if ($verseid*1 >= 0 && $secondverseid *1 >= 0) {
      							$this->id_firstVerse = $verseid;
      							$this->id_secondVerse = $secondverseid;
      							$this->sanitizedVerses = $book.' '.$chapter.':'.$verse.'-'.$secondverse;
      						} else {
      							$this->ProcessError('Verse(s) not found');
      						}
      					} else {
      						$this->ProcessError('Chapter not found');
      					}
      				} else {
      					$this->ProcessError('Book not found');
      				}
      				break;
      			case "B C:V - C:V":
      				$book = substr($book, 0, $this->GetChapterNumLoc($book));
      				
      				if ($book == 'SongofSongs') {
      					$book = 'Song of Songs'; // despacifying makes XML miss ID
      				} elseif($book == 'ActsoftheApostles') {
      					$book = 'Acts of the Apostles';
      				}
      				
      				$chapter = substr($splits[0], $this->GetChapterNumLoc($splits[0]));
      				$verse = $splits[1];
      				$secondchapter = $splits[2];
      				$secondverse = $splits[3];
      				
      				$bookid = $this->LookupBookId($this->xml, $book);
      				if ($bookid*1 >= 0) {
      					$this->id_firstBook = $bookid;
      					$chapterid = $this->LookupChapterId($this->xml, $book, $chapter);
      					$secondchapterid = $this->LookupChapterId($this->xml, $book, $secondchapter);
      					if ($chapterid*1 >= 0) {
      						$this->id_firstChapter = $chapterid;
      						$verseid = $this->LookupVerseId($this->xml, $book, $chapter, $verse);
      						if ($verseid*1 >= 0) {
      							$this->id_firstVerse = $verseid;
      						} else {
      							$this->ProcessError('Verse not found');
      						}
      					} else {
      						$this->ProcessError('Chapter not found');
      					}
      					if ($secondchapterid*1 >= 0) {
      						$this->id_secondChapter = $secondchapterid;
      						$secondverseid = $this->LookupVerseId($this->xml, $book, $secondchapter, $secondverse);
      						if ($secondverseid*1 >= 0) {
      							$this->id_secondVerse = $secondverseid;
      							$this->sanitizedVerses = $book.' '.$chapter.':'.$verse.' - '.$secondchapter.':'.$secondverse;
      						} else {
      							$this->ProcessError('Verse not found');
      						}
      					} else {
      						$this->ProcessError('Chapter not found');
      					}
      				} else {
      					$this->ProcessError('Book not found');
      				}
      				break;
      			case "B C:V - B C:V":
      				$book = substr($book, 0, $this->GetChapterNumLoc($book));
      				$chapter = substr($splits[0], $this->GetChapterNumLoc($splits[0]));
      				$verse = $splits[1];
      				
      				if ($book == 'SongofSongs') {
      					$book = 'Song of Songs'; // despacifying makes XML miss ID
      				} elseif($book == 'ActsoftheApostles') {
      					$book = 'Acts of the Apostles';
      				}
      				
      				$secondbook = $splits[2];
      				if ($this->GetNumberedBookStatus($secondbook) == 1) {
      					$scrollnum = substr($secondbook, 0, 1);
      					$scrollname = substr($secondbook,1);
      					$secondbook = $scrollnum.' '.$scrollname;
      				}
      				$secondbook = substr($secondbook, 0, $this->GetChapterNumLoc($secondbook));
      				// Restore spacing only after $secondbook has been extracted from the input
      				if ($secondbook == 'SongofSongs') {
      					$secondbook = 'Song of Songs'; // despacifying makes XML miss ID
      				} elseif($secondbook == 'ActsoftheApostles') {
      					$secondbook = 'Acts of the Apostles';
      				}
      				$secondchapter = substr($splits[2], $this->GetChapterNumLoc($splits[2]));
      				$secondverse = $splits[3];
      				$bookid = $this->LookupBookId($this->xml, $book);
      				$secondbookid = $this->LookupBookId($this->xml, $secondbook);
      				// Look up the second chapter here so it is defined even when the first book is not found
      				$secondchapterid = $this->LookupChapterId($this->xml, $secondbook, $secondchapter);
      	
      				if ($bookid*1 >= 0) {
      					$this->id_firstBook = $bookid;
      					$chapterid = $this->LookupChapterId($this->xml, $book, $chapter);
      					if ($chapterid*1 >= 0) {
      						$this->id_firstChapter = $chapterid;
      						$verseid = $this->LookupVerseId($this->xml, $book, $chapter, $verse);
      						if ($verseid*1 >= 0) {
      							$this->id_firstVerse = $verseid;
      						} else {
      							$this->ProcessError('Verse not found');
      						}
      					} else {
      						$this->ProcessError('Chapter not found');
      					}
      				} else {
      					$this->ProcessError('Book not found');
      				}
      				if ($secondbookid*1 >= 0) {
      					$this->id_secondBook = $secondbookid;
      					if ($secondchapterid*1 >= 0) {
      						$this->id_secondChapter = $secondchapterid;
      						$secondverseid = $this->LookupVerseId($this->xml, $secondbook, $secondchapter, $secondverse);
      						if ($secondverseid*1 >= 0) {
      							$this->id_secondVerse = $secondverseid;
      							$this->sanitizedVerses = $book.' '.$chapter.':'.$verse.' - '.$secondbook.' '.$secondchapter.':'.$secondverse;
      
      						} else {
      							$this->ProcessError('Verse not found');
      						}
      					} else {
      						$this->ProcessError('Chapter not found');
      					}
      				} else {
      					$this->ProcessError('Book not found');
      				}
      				break;
      			case "Error: Unknown Format (Book should be followed by Chapter number)":
      			case "Error: Unknown Format (Are you missing a chapter?)":
      			case "Error: Unknown Format (Are you trying to reference more than one verse?  We suggest Book Chapter:Verse - Verse format)":
      				$this->ProcessError($verseformat);
      				break;
      			default:
      				$this->ProcessError("Error: Unknown Format");
      		}
      	}
      

      Not elegant, but it works … :)
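One way to trim some of the repetition would be to pull the multi-word book fix-up, which appears in every case above, into a single lookup. This is only a sketch: the helper name `restoreBookSpacingSketch` is invented, and the map holds just the two books that the code above already special-cases.

```php
<?php
// Hypothetical helper; the map covers only the two names handled in the post.
function restoreBookSpacingSketch(string $book): string {
    $multiWord = [
        'SongofSongs'       => 'Song of Songs',
        'ActsoftheApostles' => 'Acts of the Apostles',
    ];
    // Fall back to the name unchanged when no spacing fix is needed
    return $multiWord[$book] ?? $book;
}

echo restoreBookSpacingSketch('SongofSongs'), "\n"; // Song of Songs
echo restoreBookSpacingSketch('John'), "\n";        // John
```

Each case could then call the helper once for `$book` (and once for `$secondbook` in the last case) instead of repeating the if/elseif block.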
