In this step, we define our categories and write a rule for each one.
The defined categories are: Asia, Europe, Africa, Middle East, Latin America, United States, Conflicts, Finance, Technology, Consumer Electronics, World Politics, U.S. Politics, Astronomy, Paleontology, Health, Natural Disasters, Law, and Music News.
A rule is a query that selects documents for the category. For example, the category 'Asia' has the rule 'China or Pakistan or India or Japan'. We insert our rules into the news_categories table as follows:
insert into news_categories values (1,'United States','Washington or George Bush or Colin Powell');
insert into news_categories values (2,'Europe','England or Britain or Germany');
insert into news_categories values (3,'Middle East','Israel or Iran or Palestine');
insert into news_categories values (4,'Asia','China or Pakistan or India or Japan');
insert into news_categories values (5,'Africa','Egypt or Kenya or Nigeria');
insert into news_categories values (6,'Conflicts','war or soldiers or military or troops');
insert into news_categories values (7,'Finance','profit or loss or wall street');
insert into news_categories values (8,'Technology','software or computer or Oracle or Intel or IBM or Microsoft');
insert into news_categories values (9,'Consumer electronics','HDTV or electronics');
insert into news_categories values (10,'Latin America','Venezuela or Colombia or Argentina or Brazil or Chile');
insert into news_categories values (11,'World Politics','Hugo Chavez or George Bush or Tony Blair or Saddam Hussein or United Nations');
insert into news_categories values (12,'US Politics','George Bush or Democrats or Republicans or civil rights or Senate or White House');
insert into news_categories values (13,'Astronomy','Jupiter or Earth or star or planet or Orion or Venus or Mercury or Mars or Milky Way or Telescope or astronomer or NASA or astronaut');
insert into news_categories values (14,'Paleontology','fossils or scientist or paleontologist or dinosaur or Nature');
insert into news_categories values (15,'Health','stem cells or embryo or health or medical or medicine or World Health Organization or AIDS or HIV or virus or centers for disease control or vaccination');
insert into news_categories values (16,'Natural Disasters','earthquake or hurricane or tornado');
insert into news_categories values (17,'Law','abortion or Supreme Court or illegal or legal or legislation');
insert into news_categories values (18,'Music News','piracy or anti-piracy or Recording Industry Association of America or copyright or copy-protection or CDs or music or artist or song');
commit;
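After the commit, a quick sanity check can confirm that the rules were loaded. The following is a minimal sketch, not part of the original steps. It assumes the table was empty before the inserts above and that its first column holds the numeric identifier. The column names are not shown in this step, so the queries avoid them.

```sql
-- Expect 18 rows if news_categories was empty before the inserts above.
select count(*) from news_categories;

-- List the rules in identifier order. "order by 1" sorts by the first
-- column, which is assumed here to be the numeric ID.
select * from news_categories order by 1;
```

If the count is higher than 18, rows from an earlier run are probably still in the table.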