English Learning in Big Data: A Keyword Analysis in 80 YouTube Videos
Namkil Kang 1
1 College of Liberal Arts, Far East University, South Korea
ABSTRACT
The main goal of this paper is to analyze 80 YouTube videos related to English learning. With respect to word length, the four-word expression has the highest frequency (159 tokens) and the highest proportion (0.14). YouTubers treat word as an essential keyword for English learning. Topic 10 was the most widely used by YouTubers, followed by topics 2 and 7 (tied), topic 5, and topic 9, in that order. In terms of the number of videos in which a word appears, English was the most widely used word, followed by video, shorts, practice (tied with sentence and word), and learning (tied with vocabulary), in that order. Finally, this paper argues that the words education, video, practice, word, speaking, news, class, study, vocabulary, lesson, sentence, etc. are linked to English and learning. It is concluded that these linked words indicate essential prerequisites for English learning.
Received: 05 October 2022
Accepted: 06 November 2022
Published: 30 November 2022
Corresponding Author: Namkil Kang, somerville@hanmail.net
DOI: 10.29121/granthaalayah.v10.i11.2022.4864
Funding: This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
Copyright: © 2022 The Author(s). This work is licensed under a Creative Commons Attribution 4.0 International License. With the license CC-BY, authors retain the copyright, allowing anyone to download, reuse, re-print, modify, distribute, and/or copy their contribution. The work must be properly attributed to its author.
Keywords: English Learning, Topic, Keyword, YouTube, Big Data, Visualization
1. INTRODUCTION
The main purpose of this paper is to analyze 80 YouTube videos related to English learning. We collected the 80 videos (on 12, 10, 2022) with the YouTube data collector and analyzed them with the software package NetMiner. First, we report the frequency of word length. Second, we look into the frequency of words related to English learning. Third, we present the 10 topics that were most used in the 80 videos. Each topic consists of 5 keywords used frequently in the videos. By analyzing the 10 topics and their keywords, one can see what YouTubers think about English learning. Fourth, we consider how many videos a particular word appears in; that is, we examine the frequency of documents in which a word occurs. Finally, we provide a visualization of 26 words related to English learning.
The organization of this paper is as follows. In section 3.1, we argue that the four-word expression has the highest frequency (159 tokens) and the highest proportion (0.14). In section 3.2, we further argue that YouTubers think of word as an indispensable keyword for English learning. In section 3.3, we contend that topic 10 was the most widely used by YouTubers, followed by topics 2 and 7 (tied), topic 5, and topic 9, in that order. In section 3.4, we maintain that the word English was the most widely used, followed by video, shorts, practice (tied with sentence and word), and learning (tied with vocabulary), in that order. In section 3.5, we show that the words education, video, practice, word, speaking, news, class, study, vocabulary, lesson, sentence, etc. are linked to English and learning. This in turn suggests that they are all indispensable factors for English learning.
2. METHODS
This paper analyzes 80 YouTube videos related to English learning, collected on 12, 10, 2022. We collected them with the YouTube data collector and analyzed them with NetMiner. The paper addresses the following questions: What is the frequency of each word length? What is the frequency of words related to English learning? Which topics are formed by the main keywords? In how many documents does each topic and each word occur? Finally, how can the words related to English learning be visualized?
3. RESULTS
3.1. WORD LENGTH
The goal of this section is to provide the frequency of word length. Table 1 shows word length, its frequency, its proportion, and its cumulative proportion:
Table 1 Word Length
Value | Frequency | Proportion | Cumulative Proportion
2.0 | 27 | 0.024 | 0.024
3.0 | 71 | 0.063 | 0.086
4.0 | 159 | 0.14 | 0.227
5.0 | 158 | 0.139 | 0.366
6.0 | 133 | 0.117 | 0.483
7.0 | 117 | 0.103 | 0.586
8.0 | 83 | 0.073 | 0.66
9.0 | 72 | 0.063 | 0.723
10.0 | 37 | 0.033 | 0.756
11.0 | 31 | 0.027 | 0.783
12.0 | 20 | 0.018 | 0.801
13.0 | 37 | 0.033 | 0.833
14.0 | 19 | 0.017 | 0.85
15.0 | 23 | 0.02 | 0.87
16.0 | 9 | 0.008 | 0.878
17.0 | 11 | 0.01 | 0.888
18.0 | 12 | 0.011 | 0.899
19.0 | 15 | 0.013 | 0.912
20.0 | 11 | 0.01 | 0.922
21.0 | 9 | 0.008 | 0.929
22.0 | 11 | 0.01 | 0.939
23.0 | 10 | 0.009 | 0.948
24.0 | 7 | 0.006 | 0.954
25.0 | 10 | 0.009 | 0.963
26.0 | 5 | 0.004 | 0.967
27.0 | 4 | 0.004 | 0.971
28.0 | 3 | 0.003 | 0.974
29.0 | 6 | 0.005 | 0.979
30.0 | 1 | 0.001 | 0.98
31.0 | 1 | 0.001 | 0.981
33.0 | 1 | 0.001 | 0.981
35.0 | 2 | 0.002 | 0.983
36.0 | 1 | 0.001 | 0.984
37.0 | 6 | 0.005 | 0.989
38.0 | 4 | 0.004 | 0.993
41.0 | 1 | 0.001 | 0.994
46.0 | 1 | 0.001 | 0.995
47.0 | 1 | 0.001 | 0.996
49.0 | 1 | 0.001 | 0.996
54.0 | 1 | 0.001 | 0.997
58.0 | 1 | 0.001 | 0.998
65.0 | 1 | 0.001 | 0.999
86.0 | 1 | 0.001 | 1
Total | 1134 | 1 |
As Table 1 shows, the four-word expression has the highest frequency (159 tokens) and the highest proportion (0.14); its cumulative proportion is 0.227. The five-word expression is a close second (158 tokens), with a proportion of 0.139 and a cumulative proportion of 0.366. The six-word expression ranks third (133 tokens), and the seven-word expression ranks fourth (117 tokens), with a proportion of 0.103 and a cumulative proportion of 0.586. The eight-word expression is the fifth highest (83 tokens). Finally, the nine-word expression ranks sixth (72 tokens), with a proportion of 0.063 and a cumulative proportion of 0.723. We thus conclude that the four-word expression has the highest frequency (159 tokens) and the highest proportion (0.14).
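The proportion and cumulative proportion columns of Table 1 follow directly from the frequency column. The sketch below recomputes them for the first four rows, using the total of 1134 from the table. The function name `proportion_table` is introduced here for illustration only, and rounding to three decimals is an assumption about how the table values were produced.

```python
# Frequencies of the four shortest lengths in Table 1 (length -> frequency).
frequencies = {2: 27, 3: 71, 4: 159, 5: 158}
TOTAL = 1134  # "Total" row of Table 1


def proportion_table(freqs, total):
    """Return rows of (length, frequency, proportion, cumulative proportion)."""
    rows = []
    running = 0  # running sum of frequencies up to the current length
    for length in sorted(freqs):
        count = freqs[length]
        running += count
        rows.append(
            (length, count, round(count / total, 3), round(running / total, 3))
        )
    return rows


rows = proportion_table(frequencies, TOTAL)
for row in rows:
    print(row)
```

Under these assumptions the recomputed values agree with the corresponding rows of Table 1, for example 159/1134 ≈ 0.14 and (27 + 71 + 159)/1134 ≈ 0.227 for length 4.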
3.2. FREQUENCY OF WORDS RELATED TO ENGLISH LEARNING
In this section, we aim to examine the frequency of words which are closely related to English learning. Table 2 shows the frequency of main words related to English learning:
Table 2 Frequency of Words
Words | Part of Speech | Frequency
Channel | Noun | 10
Daily | Adjective | 41
English | Noun | 364
Learn | Noun | 34
Learning | Noun | 19
Use | Noun | 11
Channel | Noun | 19
Class | Noun | 74
Education | Noun | 16
English | Noun | 56
Grammar | Noun | 17
Language | Noun | 21
Learning | Noun | 28
Lesson | Noun | 11
Meaning | Noun | 51
News | Noun | 10
Practice | Noun | 59
Sentence | Noun | 114
Shorts | Noun | 38
Speaking | Noun | 21
Study | Noun | 13
Use | Noun | 113
Video | Noun | 79
Vocabulary | Noun | 71
Word | Noun | 196
Youtube shorts | Noun | 10
As illustrated in Table 2, the word English was the most widely used (364 tokens) and thus has the highest frequency. The word word is the second most widely used (196 tokens). This in turn suggests that YouTubers think of words as the most important keyword for English learning. The word sentence ranks third (114 tokens), which implies that YouTubers also think of sentences as essential. The word video is the fifth highest among the keywords (79 tokens), after use (113 tokens), which suggests that YouTubers regard videos as indispensable for English learning. The word class ranks sixth (74 tokens), which implies that many YouTubers believe that classes are necessary for English learning. The word vocabulary is the seventh highest (71 tokens), which suggests that many YouTubers also think of vocabulary as important; this is why both vocabulary and word rank high. Finally, the word practice is the eighth highest (59 tokens), which suggests that many YouTubers also judge it to be necessary. We thus conclude that many YouTubers think of words as the most important for English learning.
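The rankings quoted from Table 2 can be checked by sorting the frequencies. The following is a minimal sketch using the eight highest entries of the table; where a word is listed twice (English, Use), the larger value is taken, which is an assumption made here for illustration. The names `table2` and `rank_of` are hypothetical.

```python
# Eight highest-frequency entries of Table 2 (word -> tokens).
# Where the table lists a word twice, the larger value is used (assumption).
table2 = {
    "English": 364,
    "Word": 196,
    "Sentence": 114,
    "Use": 113,
    "Video": 79,
    "Class": 74,
    "Vocabulary": 71,
    "Practice": 59,
}

# Sort by frequency, highest first, and assign ranks starting at 1.
ranked = sorted(table2.items(), key=lambda item: item[1], reverse=True)
rank_of = {word: position for position, (word, _) in enumerate(ranked, start=1)}

for word, tokens in ranked:
    print(rank_of[word], word, tokens)
```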
3.3. TOPICS AND THEIR KEYWORDS
In this section, we provide ten topics and their keywords:
Table 3 Topic Information
Topic | 1st Keyword | 2nd Keyword | 3rd Keyword | 4th Keyword | 5th Keyword
Topic-1 | Question | Gk | Answer | Fluency | Exam
Topic-2 | Complaylist | Learning | Level | Shorts | Day
Topic-3 | Practice | English | Conversation | Beginner | Language
Topic-4 | English | Odia | Use | Odia | Class
Topic-5 | English | Short | Tamil | Speaking | Youtube
Topic-6 | Education |  | India | Art | Motivation
Topic-7 | Word | English | Meaning | Shorts | Use
Topic-8 | Video | Learning | Kid | Learn | Skill
Topic-9 | Sentence | Kaise | Use | Practice | Video
Topic-10 | English | Course | Bengali | Spoken | Learn
As exemplified in Table 3, there are ten topics that were much used by YouTubers. Topic 3 consists of the five keywords practice, English, conversation, beginner, and language. Its 1st keyword is practice, which implies that many YouTubers judge practice to be the most important. In topic 1, the 1st keyword is question, which may indicate that many YouTubers think of it as the most important. Topic 8 consists of the five keywords video, learning, kid, Learn, and skill. In this topic, the 1st keyword is video, which suggests that many YouTubers judge it to be the most necessary. Across the ten topics, English was the most widely used 1st keyword, whereas learning and English were equally the most used 2nd keywords. Similarly, use was the most used 3rd keyword, and shorts was the most used 4th keyword.
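The claims about the most used keyword in each position can be verified by counting the columns of Table 3. The sketch below transcribes the table and tallies each keyword position with `collections.Counter`; the helper name `most_common_at` is introduced here for illustration, and the empty 2nd keyword of Topic-6 is skipped.

```python
from collections import Counter

# Table 3, transcribed: topic -> [1st, 2nd, 3rd, 4th, 5th keyword].
topics = {
    "Topic-1": ["Question", "Gk", "Answer", "Fluency", "Exam"],
    "Topic-2": ["Complaylist", "Learning", "Level", "Shorts", "Day"],
    "Topic-3": ["Practice", "English", "Conversation", "Beginner", "Language"],
    "Topic-4": ["English", "Odia", "Use", "Odia", "Class"],
    "Topic-5": ["English", "Short", "Tamil", "Speaking", "Youtube"],
    "Topic-6": ["Education", "", "India", "Art", "Motivation"],  # 2nd is blank
    "Topic-7": ["Word", "English", "Meaning", "Shorts", "Use"],
    "Topic-8": ["Video", "Learning", "Kid", "Learn", "Skill"],
    "Topic-9": ["Sentence", "Kaise", "Use", "Practice", "Video"],
    "Topic-10": ["English", "Course", "Bengali", "Spoken", "Learn"],
}


def most_common_at(position):
    """Return (sorted list of top keywords, count) for a 0-based position."""
    counts = Counter(kw[position] for kw in topics.values() if kw[position])
    best = max(counts.values())
    return sorted(word for word, n in counts.items() if n == best), best


for position in range(4):
    print(position + 1, most_common_at(position))
```

Ties are returned together, so the 2nd position yields both English and Learning, each with two occurrences.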
Now let us turn to the frequency of documents:
Table 4 Frequency of Documents
Topic | # of documents
Topic-1 | 5
Topic-2 | 11
Topic-3 | 3
Topic-4 | 4
Topic-5 | 9
Topic-6 | 2
Topic-7 | 11
Topic-8 | 7
Topic-9 | 8
Topic-10 | 20
Topic 10 was the most widely used topic: it occurred in 20 YouTube videos. As observed earlier, topic 10 consists of the keywords English, course, Bengali, Spoken, and learn. Topic 2 and topic 7 were the second most frequently used, each appearing in 11 YouTube videos. Topic 2 is formed by the keywords complaylist, learning, level, shorts, and day, whereas topic 7 consists of word, English, meaning, shorts, and use. Topic 5 was the third most widely used, occurring in 9 YouTube videos. Finally, topic 9 occurred in 8 YouTube videos and ranks fourth among the 10 topics. Note that topic 9 includes the keywords sentence, kaise, use, practice, and video. It can thus be concluded that topic 10 was the most widely used, followed by topics 2 and 7 (tied), topic 5, and topic 9, in that order.
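The ordering above treats tied topics as sharing a rank, with the next topic taking the next rank (often called dense ranking). A minimal sketch over the Table 4 counts follows; the variable names are hypothetical. It also sums the counts, which come to 80, a figure consistent with each video being assigned to exactly one topic, although the text does not state this explicitly.

```python
# Table 4, transcribed: topic -> number of documents (videos).
docs_per_topic = {
    "Topic-1": 5, "Topic-2": 11, "Topic-3": 3, "Topic-4": 4, "Topic-5": 9,
    "Topic-6": 2, "Topic-7": 11, "Topic-8": 7, "Topic-9": 8, "Topic-10": 20,
}

# Dense ranking: equal counts share a rank, and ranks have no gaps.
distinct_counts = sorted(set(docs_per_topic.values()), reverse=True)
rank = {
    topic: distinct_counts.index(count) + 1
    for topic, count in docs_per_topic.items()
}

total_documents = sum(docs_per_topic.values())

for topic in sorted(docs_per_topic, key=lambda t: rank[t]):
    print(rank[topic], topic, docs_per_topic[topic])
print("total:", total_documents)
```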
3.4. DEGREE
The goal of this section is to provide information on degree, that is, the number of videos in which a word appears:
Table 5 Degree
Number | Word | Frequency
1 | English | 65
2 | Video | 29
3 | Shorts | 26
4 | Practice | 22
5 | Sentence | 22
6 | Word | 22
7 | Learning | 21
8 | Vocabulary | 21
9 | English | 19
10 | Use | 18
11 | Learn | 17
12 | Speaking | 16
13 | Daily | 15
14 | Course | 15
15 | Meaning | 14
16 | Class | 13
17 | Hindi | 11
18 | Learning | 11
19 | Sentences | 11
20 | Short | 11
21 | Practice | 10
22 | Use | 10
23 | Education | 10
24 | Skill | 10
25 | Channel | 9
26 | Corn | 9
27 | Conversation | 9
28 | Day | 9
29 | Grammar | 9
30 |  | 9
31 | Lesson | 9
32 | Translation | 9
33 | Youtube | 9
34 | Link | 8
35 | Bolna | 8
36 | Classis | 8
37 | Language | 8
38 | Level | 8
39 | Study | 8
40 | Basic | 7
41 | Life | 7
42 | News | 7
43 | Research | 7
44 | Youtubeshorts | 7
45 | Channel | 6
46 | LEARN | 6
47 | Spoken | 6
48 | Subscribe | 6
49 | Translation | 6
50 | Beginner | 6
Table 5 gives, for each word, the number of videos in which it appears. The word English appeared in 65 YouTube videos, which makes it the most widely used word. The word video was the second most widely used, appearing in 29 YouTube videos. This in turn indicates that many YouTubers believe that videos are an effective way to learn English. The word shorts ranks third, appearing in 26 YouTube videos. The word practice occurred in 22 YouTube videos, which indicates that many YouTubers judge it to be essential. The words sentence and word likewise occurred in 22 YouTube videos each, so the three words share fourth place. This in turn suggests that word is considered essential by YouTubers. The word vocabulary behaves much like word: it occurred in 21 YouTube videos, the same number as learning, and is the seventh most widely used. To sum up, the word English was the most widely used, followed by video, shorts, practice (tied with sentence and word), and learning (tied with vocabulary), in that order. It is also worth noting that the word Practice occurred in 10 YouTube videos, that the word conversation occurred in 9 YouTube videos, and that the word news appeared in 7 YouTube videos. From all of this, it is evident that they are all necessary for English learning.
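The difference between the token frequency of section 3.2 and the document frequency reported here can be sketched as follows. The three short documents are invented purely for illustration and are not taken from the collected videos, and the function names are hypothetical. The sketch lowercases every token; whether the original analysis normalized case is not stated.

```python
from collections import Counter

# Invented mini-corpus: each string stands in for the text of one video.
documents = [
    "english speaking practice english sentence",
    "learn english word meaning",
    "daily english vocabulary word word",
]


def token_frequency(docs):
    """Count every occurrence of every token across all documents."""
    return Counter(token for doc in docs for token in doc.lower().split())


def document_frequency(docs):
    """Count, for each token, the number of documents that contain it."""
    return Counter(token for doc in docs for token in set(doc.lower().split()))


tf = token_frequency(documents)
df = document_frequency(documents)
print("english:", tf["english"], "tokens in", df["english"], "documents")
print("word:", tf["word"], "tokens in", df["word"], "documents")
```

Converting each document to a set before counting is what keeps a word that is repeated within one video from being counted more than once.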
3.5. VISUALIZATION OF WORDS
The main goal of this section is to visualize which words are closely related to English learning. Figure 1 shows the visualization of words related to English learning:
Figure 1 Visualization of English Learning
As exemplified in Figure 1, 26 words are closely related to one another. The words linked to English and learning are education, video, practice, word, speaking, news, class, study, vocabulary, lesson, sentence, etc. This in turn implies that they are closely related to English learning and are important factors for it. For the visualization of synonyms, see Kang (2022a), Kang (2022b), Kang (2022c), and Kang (2022d). To sum up, Figure 1 shows which factors are closely related to English learning.
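How NetMiner constructed the network in Figure 1 is not described here. One common way to obtain such word links, offered only as an assumption, is to connect two words whenever they occur in the same document and to weight the link by the number of shared documents. The sketch below does this for three invented documents; all names and data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Invented mini-corpus: each string stands in for the text of one video.
documents = [
    "english speaking practice",
    "english learning video",
    "english learning vocabulary",
]


def cooccurrence_edges(docs):
    """Map each unordered word pair to the number of documents sharing it."""
    weights = defaultdict(int)
    for doc in docs:
        words = sorted(set(doc.lower().split()))  # unique words, fixed order
        for first, second in combinations(words, 2):
            weights[(first, second)] += 1
    return weights


def neighbours(word, edges):
    """Return the set of words linked to `word` by at least one edge."""
    linked = set()
    for first, second in edges:
        if first == word:
            linked.add(second)
        elif second == word:
            linked.add(first)
    return linked


edges = cooccurrence_edges(documents)
print(sorted(neighbours("learning", edges)))
print(edges[("english", "learning")])
```

The resulting weighted pairs could then be passed to any graph-drawing tool to produce a picture comparable in kind to Figure 1.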
4. CONCLUSION
To sum up, we have analyzed 80 YouTube videos related to English learning. In section 3.1, we have shown that the four-word expression has the highest frequency (159 tokens) and the highest proportion (0.14). In section 3.2, we have argued that YouTubers think of word as the most important keyword for English learning. In section 3.3, we have further argued that topic 10 was the most widely used by YouTubers, followed by topics 2 and 7 (tied), topic 5, and topic 9, in that order. In section 3.4, we have maintained that the word English was the most widely used, followed by video, shorts, practice (tied with sentence and word), and learning (tied with vocabulary), in that order. In section 3.5, we have shown that the words education, video, practice, word, speaking, news, class, study, vocabulary, lesson, sentence, etc. are linked to English and learning. This in turn implies that they are indispensable factors for English learning.
CONFLICT OF INTERESTS
None.
ACKNOWLEDGMENTS
None.
REFERENCES
Kang, N. (2022a). A Comparative Analysis of Search for and Look for in Four Corpora. Advances in Social Sciences Research Journal 9 (3), 168-178. https://doi.org/10.14738/assrj.93.11980.
Kang, N. (2022b). A Comparative Analysis of Impressed by and Impressed with in Two Corpora. Theory and Practice in Language Studies 12 (5), 819-827. https://doi.org/10.17507/tpls.1205.01.
Kang, N. (2022c). On Speak to and Talk to: A Corpora-based Analysis. Theory and Practice in Language Studies 12 (7), 1262-1270. https://doi.org/10.17507/tpls.1207.03.
Kang, N. (2022d). On Speak with and Talk with: A Corpora-based Analysis. International Journal of Social Science and Human Research 5 (8), 3354-3360.
© Granthaalayah 2014-2022. All Rights Reserved.