
The Story of Keywords

  • Writer: Faruk Hasan
  • Aug 16, 2022

A Kolpokoushol Project

We analysed data from a newspaper (Dhaka Tribune), covering 2012 to 2016. We were given a large .json file containing the data we needed. In other words, the project is mainly "News about news"!

Technical Aspects

We parsed the data that Kolpokoushol provided to us using the Python programming language. Parsing here means extracting specific information from a labyrinth of encoded text. The information we extracted was mainly the keywords of each news article. Afterwards, we made a couple of visualizations via various means.

Process


We started with the raw JSON data. What a mess!

We used the code below to extract the keywords from each news article and then count how many times each keyword occurs.
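The extraction and counting step can be sketched roughly as follows. This is a minimal illustration, not the project's original code: the layout of the JSON file, the "keywords" field name, and the sample articles are all assumptions made for the example.

```python
import json
from collections import Counter

# Hypothetical sample standing in for the real data file. The actual
# structure and the "keywords" field name are assumptions.
SAMPLE_JSON = """
[
  {"title": "Article A", "keywords": ["Cricket", "Bangladesh"]},
  {"title": "Article B", "keywords": ["cricket", "Dhaka"]},
  {"title": "Article C", "keywords": ["Dhaka "]}
]
"""

def count_keywords(articles):
    """Count how many articles mention each keyword (case-insensitive)."""
    counts = Counter()
    for article in articles:
        # Normalise case and whitespace; the set avoids double-counting
        # a keyword that is repeated within a single article.
        unique = {
            kw.strip().lower()
            for kw in article.get("keywords", [])
            if kw.strip()
        }
        counts.update(unique)
    return counts

articles = json.loads(SAMPLE_JSON)
keyword_counts = count_keywords(articles)
```

Counting each keyword once per article is a design choice in this sketch. Counting every raw occurrence instead would only need the set replaced with a list.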

The result after parsing:

Finally, we wrote code that returns all keywords with a frequency greater than a threshold.
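A threshold filter of this kind might look like the sketch below. The function name `keywords_above` and the example counts are hypothetical and made up for illustration; they are not results from the project.

```python
from collections import Counter

def keywords_above(counts, threshold):
    """Return (keyword, count) pairs with count > threshold, most frequent first."""
    frequent = [(kw, n) for kw, n in counts.items() if n > threshold]
    # Sort by descending count, then alphabetically to break ties.
    return sorted(frequent, key=lambda pair: (-pair[1], pair[0]))

# Made-up counts for illustration only.
example_counts = Counter({"cricket": 5, "football": 2, "technology": 1})
frequent_keywords = keywords_above(example_counts, 1)
```

Using a strict "greater than" comparison matches the description above: a keyword whose count equals the threshold is left out.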

Results:

News about different sectors:


WORD CLOUD:

Some inferences from the graph

  1. News about "Rape" is more frequent than any other crime news

  2. There is a peak in the Terrorism section

  3. After "Rape", Human Trafficking is our biggest problem, judging by the quantity of news.

  4. Crime news is more frequent than any other news category

  5. There is much less Technology news

  6. There is more news about Cricket than Football

  7. There is something political going on

  8. There is more negative news than positive news


 
 
 
