I was playing the trivia game HQ with a friend yesterday. We ended up shamefully googling some of the answers. Most of the time the answer was on the first Wikipedia page that came up, but we could not find it fast enough. What if we could find patterns in how the answer to a multiple-choice trivia question shows up on the internet? Enter Trivi.
You can ask Trivi a question like
“A flashing red traffic light signifies that a driver should do what?”
and provide the possible answers in multiple-choice format:
“proceed with caution”
“honk the horn”
Current algorithm:
– Trivi searches the question on Google and scrapes the search results page for keyword hits
– Trivi then looks up the corresponding Wikipedia article and crawls it for keyword hits
Hits from both sources are weighted and summed, and the answer with the highest total is picked as the right one.
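The weighting step can be sketched roughly like this. It is only an illustration, not Trivi's actual code: the source names, the weight values, and the function names are all made up for the example, and the page text is assumed to have been fetched already.

```python
from typing import Dict, List

# Hypothetical per-source weights; the real values Trivi uses are not given here.
SOURCE_WEIGHTS = {"google": 1.0, "wiki_first": 0.8, "wiki_second": 0.5}


def count_hits(text: str, answer: str) -> int:
    """Count case-insensitive occurrences of the answer phrase in the text."""
    return text.lower().count(answer.lower())


def score_answers(sources: Dict[str, str], answers: List[str]) -> Dict[str, float]:
    """Sum the weighted hit counts of every answer over all sources."""
    scores = {}
    for answer in answers:
        scores[answer] = sum(
            SOURCE_WEIGHTS.get(name, 0.0) * count_hits(text, answer)
            for name, text in sources.items()
        )
    return scores


def best_answer(sources: Dict[str, str], answers: List[str]) -> str:
    """Return the answer with the highest weighted score."""
    scores = score_answers(sources, answers)
    return max(scores, key=scores.get)
```

Calling `best_answer(pages, ["proceed with caution", "honk the horn"])`, where `pages` maps a source name to its scraped text, returns whichever answer collected the most weighted hits.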
Currently, Trivi does well on fact-based, non-numerical questions.
A good question for Trivi: When was Elon Musk born?
A bad question for Trivi: When did Elon Musk marry Justine?
Key features to be noted:
– Weights are applied to keywords crawled from the Google results page itself, the first Wikipedia page, and the second Wikipedia page
– Weights are applied to split-word matching, e.g. “flat bread” is also matched individually as “flat” and as “bread”, but these matches carry less weight than the whole-phrase match “flat bread”
– Weights are applied to merged matches: matching “flat bread” against “flatbread” counts for less than an exact match
– Trivi keeps a list of the 100 most frequently used English words; if a split answer contains one of them, that word is not matched. E.g. for the answer “beat the bush”, Trivi will split-match “beat” and “bush”, but not “the”
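The three match types above might look something like the following sketch. The weight values, the tiny `COMMON_WORDS` set (a stand-in for the real top-100 list), and the function names are assumptions for illustration, not taken from the Trivi source.

```python
import re
from typing import Set

# Hypothetical weights: whole match counts most, merged less, split least.
WHOLE_WEIGHT = 1.0
MERGE_WEIGHT = 0.6
SPLIT_WEIGHT = 0.3

# Small stand-in for the list of the 100 most frequent English words.
COMMON_WORDS: Set[str] = {"the", "of", "and", "a", "to", "in", "is"}


def count_phrase(text: str, phrase: str) -> int:
    """Count whole-word, case-insensitive occurrences of phrase in text."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def match_score(text: str, answer: str) -> float:
    """Score one answer against one text using whole, merged and split matches."""
    words = answer.split()
    score = WHOLE_WEIGHT * count_phrase(text, answer)
    if len(words) > 1:
        # Merged match: "flat bread" against "flatbread".
        score += MERGE_WEIGHT * count_phrase(text, "".join(words))
        # Split match: each word on its own, skipping very common words.
        # Words inside a whole-phrase hit are counted here again as well.
        for word in words:
            if word.lower() not in COMMON_WORDS:
                score += SPLIT_WEIGHT * count_phrase(text, word)
    return score
```

With these numbers, a text that only mentions “beat” and “bush” separately still gives “beat the bush” a small score, while any number of extra “the” in the text changes nothing.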
Features to be added:
– Exclude obvious answers when an answer appears in the question, e.g. for “What color is the black box on a plane?”, exclude “black”
– Consider scraping sources other than Wikipedia
– Build a front-end demo page for Trivi
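The first planned feature, excluding answers that already appear in the question, could be approached along these lines. This is one possible reading of the idea and nothing here exists in Trivi yet; `tokenize` and `strip_question_words` are hypothetical names.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def strip_question_words(question: str, answer: str) -> List[str]:
    """Return the answer's words that do not already appear in the question."""
    question_words = set(tokenize(question))
    return [word for word in tokenize(answer) if word not in question_words]
```

For the question “What color is the black box on a plane?”, the answer “black” is left with no words to match, so it would collect no hits just because the question itself gets quoted on the page, while “orange” is kept as is.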
Example output of the current UI
Contribute at https://github.com/rmr1012/trivi