The Story of Keywords
- Faruk Hasan
- Aug 16, 2022
A Kolpokoushol Project
We analysed data from a newspaper (Dhaka Tribune), covering the years 2012 to 2016. We were given a large .json file containing the data we needed. In other words, the project is mainly "News about news"!

Technical Aspects
We parsed the data that Kolpokoushol provided to us using the Python programming language. Parsing is the process of extracting specific information from a labyrinth of encoded text. In our case, that information was mainly the keywords of each news article. Afterwards, we made a couple of visualizations from the results.
Process
We started with the raw JSON data. What a mess!

We used the code below to extract the keywords from each news article and then count how many times each keyword occurs.
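A minimal sketch of this kind of extraction and counting could look like the following. It assumes each article is a JSON object with a `keywords` list. The field name, the sample records, and the `count_keywords` helper are hypothetical and are not taken from the Kolpokoushol file.

```python
import json
from collections import Counter

# Hypothetical sample standing in for the real file.
# The "keywords" field name is an assumption about the data layout.
SAMPLE_JSON = """
[
  {"title": "Article A", "keywords": ["Cricket", "Bangladesh"]},
  {"title": "Article B", "keywords": ["cricket", "World Cup"]},
  {"title": "Article C", "keywords": ["Technology"]},
  {"title": "Article D"}
]
"""


def count_keywords(articles):
    """Count how often each keyword appears across all articles."""
    counts = Counter()
    for article in articles:
        # Articles without a "keywords" field contribute nothing.
        for keyword in article.get("keywords", []):
            # Normalise case and whitespace so "Cricket" and "cricket" match.
            counts[keyword.strip().lower()] += 1
    return counts


articles = json.loads(SAMPLE_JSON)
keyword_counts = count_keywords(articles)
print(keyword_counts.most_common(3))
```

Lower-casing the keywords is a design choice in this sketch. Without it, differently capitalised spellings of the same keyword would be counted separately.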

The result after parsing:

Finally, we wrote code that returns all keywords with a frequency greater than a given threshold.
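One way such a filter might be written is sketched here, assuming the frequencies are held in a `collections.Counter`. The function name `keywords_above` and the example counts are made up for illustration and do not come from the dataset.

```python
from collections import Counter


def keywords_above(counts, threshold):
    """Return (keyword, count) pairs whose count is strictly greater
    than the threshold, ordered from most to least frequent."""
    return [(kw, n) for kw, n in counts.most_common() if n > threshold]


# Made-up counts for illustration only.
example_counts = Counter({"cricket": 12, "politics": 30, "technology": 3})
frequent = keywords_above(example_counts, threshold=10)
print(frequent)
```

Raising the threshold keeps only the dominant topics, which helps keep a chart or word cloud readable.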
Results:

News about different sectors:

WORD CLOUD:



Some assumptions from the graph
News about "Rape" is more frequent than news about any other crime.
There is a peak in the Terrorism section
After "Rape", Human Trafficking is our biggest problem, judging by the quantity of news.
Crime news is more frequent than any other category of news.
There is much less Technology news.
There is more news about Cricket than about Football.
There is something political going on.
There is more negative news than positive news.