Counting how often each word occurs in a text file is a common task. A dictionary fits it well: each word is stored as a key and its count as the corresponding value. We iterate over every word in the file. If the word is not yet in the dictionary, we add it with a count of 1; if it is already present, we increment its count by 1.
File sample.txt
First, we create the text file whose words we want to count in Python. Let this file be sample.txt with the following contents:
Mango banana apple pear Banana grapes strawberry Apple pear mango banana Kiwi apple mango strawberry
Example 1: Count occurrences of each word in a given text file
Here, we loop over the file line by line. Each line is stripped of surrounding whitespace, converted to lowercase so that "Mango" and "mango" count as the same word, and split into words, which are then tallied in the dictionary.
Python3
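# Open the file in read mode
text = open("sample.txt", "r")

# Dictionary mapping each word to its count
d = dict()

for line in text:
    # Remove leading and trailing whitespace, including the newline
    line = line.strip()
    # Lowercase so that "Mango" and "mango" are counted together
    line = line.lower()
    # Split on whitespace; split() also skips repeated spaces and blank lines
    words = line.split()
    for word in words:
        if word in d:
            # Word seen before: increment its count
            d[word] = d[word] + 1
        else:
            # First occurrence of the word
            d[word] = 1

text.close()

# Print each word with its count
for key in list(d.keys()):
    print(key, ":", d[key])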
Output:
mango : 3
banana : 3
apple : 3
pear : 2
grapes : 1
strawberry : 2
kiwi : 1
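The same tally can also be sketched with `collections.Counter` from the standard library, which handles the "add or increment" step itself. This is an alternative to the loop above, not a replacement for it. The helper name `count_words` is made up for illustration, and the sample is assumed to be laid out on four lines; it is passed in as a list so the block runs without a file.

```python
from collections import Counter


def count_words(lines):
    """Return a Counter mapping each lowercased word to its frequency."""
    counts = Counter()
    for line in lines:
        # split() with no argument splits on any run of whitespace
        counts.update(line.lower().split())
    return counts


# Sample contents; the four-line layout is an assumption
sample_lines = [
    "Mango banana apple pear",
    "Banana grapes strawberry",
    "Apple pear mango banana",
    "Kiwi apple mango strawberry",
]

word_counts = count_words(sample_lines)
for word, n in word_counts.items():
    print(word, ":", n)
```

Because iterating over an open file yields its lines, the same function should also accept a file object, for example `with open("sample.txt") as f: count_words(f)`.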
Example 2: Count occurrences of specific words in a given text file
In this example, we count how many times the word "apple" appears in the text file. The comparison is case-sensitive, so "Apple" is not counted.
Python3
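# Word to search for
word = "apple"
count = 0

# Read the same sample.txt file used in the other examples
with open("sample.txt", "r") as f:
    for line in f:
        # Split the line into words on whitespace
        words = line.split()
        for i in words:
            # Case-sensitive comparison: "Apple" does not match "apple"
            if i == word:
                count = count + 1

print("Occurrences of the word", word, ":", count)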
Output:
Occurrences of the word apple : 2
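If "Apple" and "apple" should be counted together, one option is to lowercase both the line and the target word before comparing. The sketch below shows both behaviours side by side. The function `count_word` and its `ignore_case` parameter are hypothetical names introduced here, and the four-line layout of the sample is an assumption.

```python
def count_word(lines, target, ignore_case=False):
    """Count how many times `target` appears as a whole word in `lines`."""
    if ignore_case:
        target = target.lower()
    total = 0
    for line in lines:
        if ignore_case:
            line = line.lower()
        # list.count returns the number of exact matches in the word list
        total += line.split().count(target)
    return total


# Sample contents; the four-line layout is an assumption
sample_lines = [
    "Mango banana apple pear",
    "Banana grapes strawberry",
    "Apple pear mango banana",
    "Kiwi apple mango strawberry",
]

print(count_word(sample_lines, "apple"))                    # 2
print(count_word(sample_lines, "apple", ignore_case=True))  # 3
```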
Example 3: Count total occurrences of words in a given text file
In this example, we will count the total number of words present in a text file.
Python3
# Running total of words
count = 0

f = open("sample.txt", "r")
for line in f:
    # split() with no argument ignores the trailing newline and blank lines
    word = line.split()
    count += len(word)

print("Total Number of Words: " + str(count))
f.close()
Output:
Total Number of Words: 15
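How the line is split matters for the total. `line.split(" ")` splits on every single space, so a double space yields an empty string and a blank line still yields one item, while `line.split()` splits on runs of whitespace and skips both. The sketch below illustrates the difference on a small made-up input; `total_words` is a hypothetical helper name.

```python
def total_words(lines):
    """Return the total number of whitespace-separated words in `lines`."""
    return sum(len(line.split()) for line in lines)


# Hypothetical input containing a double space and a blank line
lines = ["Mango banana  apple\n", "\n", "pear\n"]

# Splitting on a single space over-counts: the empty string and the
# lone newline are each counted as a "word"
naive = sum(len(line.split(" ")) for line in lines)

print(naive)               # 6
print(total_words(lines))  # 4
```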
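Handling files with punctuation
If the file contains punctuation, tokens such as "Mango!" and "mango" would be counted as different words. To avoid this, we remove all punctuation characters from each line before splitting it.
sample.txt: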
Mango! banana apple pear. Banana, grapes strawberry. Apple- pear mango banana. Kiwi "apple" mango strawberry.
Code:
Python3
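import string

# Open the file in read mode
text = open("sample.txt", "r")

# Dictionary mapping each word to its count
d = dict()

for line in text:
    # Remove leading and trailing whitespace, including the newline
    line = line.strip()
    # Lowercase so that "Mango" and "mango" are counted together
    line = line.lower()
    # Delete every punctuation character from the line
    line = line.translate(str.maketrans("", "", string.punctuation))
    # Split the line into words on whitespace
    words = line.split()
    for word in words:
        if word in d:
            # Word seen before: increment its count
            d[word] = d[word] + 1
        else:
            # First occurrence of the word
            d[word] = 1

text.close()

# Print each word with its count
for key in list(d.keys()):
    print(key, ":", d[key])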
Output:
mango : 3
banana : 3
apple : 3
pear : 2
grapes : 1
strawberry : 2
kiwi : 1
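Another way to handle punctuation is to extract the words directly with a regular expression instead of deleting punctuation first. The sketch below uses `re.findall` with the pattern `\w+`; this is a swapped-in technique, not the method used above, and `count_words_regex` is a hypothetical name. The sample text is passed as a string so the block runs without a file.

```python
import re
from collections import Counter


def count_words_regex(text):
    """Count lowercased runs of word characters (letters, digits, underscore)."""
    return Counter(re.findall(r"\w+", text.lower()))


sample = (
    'Mango! banana apple pear. Banana, grapes strawberry. '
    'Apple- pear mango banana. Kiwi "apple" mango strawberry.'
)

counts = count_words_regex(sample)
for word, n in counts.items():
    print(word, ":", n)
```

The two approaches treat apostrophes differently: deleting punctuation turns "don't" into "dont", whereas `\w+` splits it into "don" and "t". Which behaviour is preferable depends on the text being counted.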