<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Food News Visualization</title>
<meta name="viewport" content="width=device-width,initial-scale=1.0">
<!-- Tabby -->
<link rel="stylesheet" href="/third_party_web/tabby-ui.min.css?v=0.1.4">
<!-- PyScript -->
<link rel="stylesheet" href="/third_party/core.css?v=0.1.4">
<script type="module" src="/third_party/core.js?v=0.1.4"></script>
<!-- Custom -->
<link rel="stylesheet" href="/css/web.css?v=0.1.4">
</head>
<body>
<py-config>
{
"packages": ["/third_party/sketchingpy-0.3.2-py3-none-any.whl"],
"files": {
"/article_preview_viz.pyscript?v=0.1.4": "article_preview_viz.py",
"/abstract.pyscript?v=0.1.4": "abstract.py",
"/article_getter.pyscript?v=0.1.4": "article_getter.py",
"/const.pyscript?v=0.1.4": "const.py",
"/data_util.pyscript?v=0.1.4": "data_util.py",
"/grid_viz.pyscript?v=0.1.4": "grid_viz.py",
"/map_viz.pyscript?v=0.1.4": "map_viz.py",
"/overview_viz.pyscript?v=0.1.4": "overview_viz.py",
"/selection_viz.pyscript?v=0.1.4": "selection_viz.py",
"/state_util.pyscript?v=0.1.4": "state_util.py",
"/table_util.pyscript?v=0.1.4": "table_util.py",
"/csv/articles.csv": "csv/articles.csv"
}
}
</py-config>
<nav>
<a href="#main" class="skip-link">Skip to content</a>
<h1><div class="highlight">Food News Viz</div></h1>
<div class="subtitle"><div class="highlight">Open source exploration of food in the news across the globe.</div></div>
<ul data-tabs>
<li><a data-tabby-default href="#introduction">Introduction</a></li>
<li><a href="#tutorial">Tutorial</a></li>
<li><a href="#app">App</a></li>
<li><a href="#express">Express</a></li>
<li><a href="#insights">Insights</a></li>
<li><a href="#method">Method</a></li>
<li><a href="#about">About</a></li>
</ul>
</nav>
<main id="main">
<section id="introduction">
<h2>Introduction</h2>
<div>
Informing a food justice project, this social listening effort asks what is part of the conversation and what is missing when the world discusses food. It asks questions like:
</div>
<ul>
<li>What are the commonly discussed food challenges?</li>
<li>What terms are used to describe common justice topics like food security?</li>
<li>What terms come up across different parts of the world when they talk about "good" food and what does that potentially tell us about the priorities or beliefs of those different regions?</li>
<li>How do governance, policy, and justice feature in food conversations and how does that vary geographically?</li>
<li>How can these data driven insights inform and inspire qualitative research?</li>
</ul>
<div>
Specifically, using news media as a lens, this natural language processing effort uses machine learning / artificial intelligence to read thousands of articles from across countries and languages to see how often different topics appear and in which context. Distilling this insight into interactive data visualizations, this website provides research tools which enable users to explore this global dataset.
</div>
<div class="cta">
<a class="button internal-link" href="#tutorial">Visualization</a>
<a class="button internal-link" href="#insights">Insights</a>
<a class="button internal-link" href="#method">Method</a>
</div>
</section>
<section id="tutorial">
<h2>Tutorial</h2>
<div class="padded-start">
Before getting started, this short video tutorial goes through an example analysis, showing how to use the interactive web-based visualization.
You can also <a class="internal-link" href="#app">skip to app</a> or <a class="internal-link" href="#insights">skip to insights</a>.
</div>
<iframe src="https://player.vimeo.com/video/906555824?h=9c82f7d79d&badge=0&autopause=0&player_id=0&app_id=58479" width="1600" height="900" frameborder="0" allow="autoplay; fullscreen; picture-in-picture" title="Food News Viz Tutorial"></iframe>
<div class="padded-start">
<em>Note: Some example articles are listed with URLs. These are provided as examples solely for the purpose of academic research. Rights retained by their authors.</em>
</div>
<div class="cta" id="tutorial-continue">
<a class="button internal-link" href="#app">Go to App</a>
<a class="button internal-link" href="#insights">Skip to Insights</a>
</div>
</section>
<section id="app">
<h2>Web Application</h2>
<div id="app-intro">
<div class="text">
This online interactive visualization allows users to explore the data in detail but may take a few moments to load for the first time. Some users such as those with slower internet connections, some mobile devices, or certain adaptive (accessibility) technologies may prefer the <a href="#express" class="internal-link">express version</a>. By continuing you agree to the site's terms as described in the <a href="#about">about section</a> which discusses open source, data license, privacy, and additional details.
</div>
<div class="cta">
<a class="button" id="open-web-app-button" href="#sketch">Open Web App</a>
</div>
</div>
<div id="sketch">
<div id="sketch-load-message">
Loading...
<progress id="sketch-load-progress" max="20" value="0"></progress>
</div>
<div>
<canvas id="sketch-canvas"></canvas>
</div>
</div>
</section>
<section id="express">
<h2>Express version</h2>
This alternative version of the <a href="#app" class="internal-link">web application</a> may work better for some mobile devices, slower internet connections, or certain adaptive (accessibility) technologies. Simply prepare a query below and indicate if you want a visualization or data export. By continuing you agree to the site's terms as described in the <a href="#about">about section</a> which discusses open source, data license, privacy, and additional details.
<section class="subsection">
<h3>Query</h3>
<form>
<div class="form-group">
<div><label for="country-select">Country (use "All" for no filter)</label></div>
<div>
<select id="country-select">
<option value="all" selected>All</option>
<option value="Australia">Australia</option>
<option value="Bangladesh">Bangladesh</option>
<option value="Bulgaria">Bulgaria</option>
<option value="Canada">Canada</option>
<option value="Germany">Germany</option>
<option value="Hungary">Hungary</option>
<option value="India">India</option>
<option value="Indonesia">Indonesia</option>
<option value="Ireland">Ireland</option>
<option value="Kenya">Kenya</option>
<option value="Lithuania">Lithuania</option>
<option value="Mexico">Mexico</option>
<option value="Nigeria">Nigeria</option>
<option value="Pakistan">Pakistan</option>
<option value="Philippines">Philippines</option>
<option value="Romania">Romania</option>
<option value="Russian Federation">Russian Federation</option>
<option value="Saudi Arabia">Saudi Arabia</option>
<option value="Singapore">Singapore</option>
<option value="South Africa">South Africa</option>
<option value="Spain">Spain</option>
<option value="Thailand">Thailand</option>
<option value="Turkey">Turkey</option>
<option value="United Arab Emirates">United Arab Emirates</option>
<option value="United Kingdom">United Kingdom</option>
<option value="United States">United States</option>
<option value="Virgin Islands">Virgin Islands</option>
</select>
</div>
</div>
<div class="form-group">
<div><label for="category-select">Category (use "All" for no filter)</label></div>
<div>
<select id="category-select">
<option value="all" selected>All</option>
<option value="economy and industry">economy and industry</option>
<option value="environment and resources">environment and resources</option>
<option value="food and materials">food and materials</option>
<option value="health and body">health and body</option>
<option value="people and society">people and society</option>
</select>
</div>
</div>
<div class="form-group">
<div><label for="tag-select">Tag (use "All" for no filter)</label></div>
<div>
<select id="tag-select">
<option value="all" selected>All</option>
<option value="access">access</option>
<option value="acid">acid</option>
<option value="acute health">acute health</option>
<option value="additives">additives</option>
<option value="age">age</option>
<option value="agency">agency</option>
<option value="agricultural materials">agricultural materials</option>
<option value="agriculture">agriculture</option>
<option value="aid">aid</option>
<option value="alcohol">alcohol</option>
<option value="allergy">allergy</option>
<option value="analysis and science">analysis and science</option>
<option value="artificial">artificial</option>
<option value="bathroom">bathroom</option>
<option value="beverage">beverage</option>
<option value="body">body</option>
<option value="business (general)">business (general)</option>
<option value="business (individual)">business (individual)</option>
<option value="cbd">cbd</option>
<option value="chemicals">chemicals</option>
<option value="chronic health">chronic health</option>
<option value="clothes">clothes</option>
<option value="commerce">commerce</option>
<option value="community">community</option>
<option value="conflict and defense">conflict and defense</option>
<option value="cooperation">cooperation</option>
<option value="crime">crime</option>
<option value="cutlery, storage, appliance">cutlery, storage, appliance</option>
<option value="delivery">delivery</option>
<option value="diet">diet</option>
<option value="drugs and medication">drugs and medication</option>
<option value="eating">eating</option>
<option value="economy">economy</option>
<option value="energy">energy</option>
<option value="environment">environment</option>
<option value="family">family</option>
<option value="food security">food security</option>
<option value="food service">food service</option>
<option value="food type">food type</option>
<option value="game and sport">game and sport</option>
<option value="geography">geography</option>
<option value="governance">governance</option>
<option value="grocery">grocery</option>
<option value="health (general)">health (general)</option>
<option value="healthcare">healthcare</option>
<option value="holiday">holiday</option>
<option value="hope">hope</option>
<option value="hospitality">hospitality</option>
<option value="house">house</option>
<option value="hunger">hunger</option>
<option value="industry">industry</option>
<option value="ingredients">ingredients</option>
<option value="judge">judge</option>
<option value="justice">justice</option>
<option value="kitchen and pantry">kitchen and pantry</option>
<option value="labor">labor</option>
<option value="leadership">leadership</option>
<option value="legal">legal</option>
<option value="love">love</option>
<option value="meal">meal</option>
<option value="media">media</option>
<option value="mental health">mental health</option>
<option value="metal">metal</option>
<option value="money">money</option>
<option value="nature">nature</option>
<option value="nutrition">nutrition</option>
<option value="other producer">other producer</option>
<option value="people (general)">people (general)</option>
<option value="people (individual)">people (individual)</option>
<option value="people (types)">people (types)</option>
<option value="pets">pets</option>
<option value="plastic">plastic</option>
<option value="poison">poison</option>
<option value="policy">policy</option>
<option value="political action">political action</option>
<option value="poverty">poverty</option>
<option value="price">price</option>
<option value="processing">processing</option>
<option value="recipe">recipe</option>
<option value="reliance">reliance</option>
<option value="religion">religion</option>
<option value="resilience">resilience</option>
<option value="rest">rest</option>
<option value="risk">risk</option>
<option value="salt">salt</option>
<option value="sexual health">sexual health</option>
<option value="spice">spice</option>
<option value="stamps and vouchers">stamps and vouchers</option>
<option value="struggle">struggle</option>
<option value="sugar and sweeteners">sugar and sweeteners</option>
<option value="supplements">supplements</option>
<option value="supplies">supplies</option>
<option value="taste">taste</option>
<option value="tea and coffee">tea and coffee</option>
<option value="tech">tech</option>
<option value="travel, transport, logistics">travel, transport, logistics</option>
<option value="waste">waste</option>
<option value="water">water</option>
<option value="weather and disaster">weather and disaster</option>
<option value="writing / mail">writing / mail</option>
</select>
</div>
</div>
<div class="form-group">
<div><label for="keyword-input">Keyword (use "all" or empty for no filter)</label></div>
<div><input type="text" id="keyword-input" value="all"></div>
</div>
<div class="form-group">
<div><label for="dimension-select">Report</label></div>
<div>
<select id="dimension-select">
<option value="country">Country Statistics (% of Country Articles match Query)</option>
<option value="category">Category Statistics (% of Articles in Query with Category)</option>
<option value="tag" selected>Tag Statistics (% of Articles in Query with Tag)</option>
<option value="keyword">Keyword Statistics (% of Articles in Query with Keyword)</option>
</select>
</div>
</div>
<div class="form-group">
<button id="execute-express-button">Execute</button>
</div>
</form>
</section>
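<!--
Maintainer note (not rendered on the page): a minimal sketch of how the two kinds of report percentages described in the Report dropdown could be computed. It assumes each article is a record with a country and a set of tags. The record layout, the sample data, and the function names `matches`, `tag_report`, and `country_report` are hypothetical illustrations, not the logic in express.js or the columns of csv/articles.csv.

```python
from collections import Counter

# Hypothetical records; the real dataset is csv/articles.csv and its
# column names may differ.
articles = [
    {"country": "Kenya", "tags": {"price", "hunger"}},
    {"country": "Kenya", "tags": {"policy"}},
    {"country": "India", "tags": {"price"}},
    {"country": "India", "tags": {"price", "policy"}},
]


def matches(article, country="all", tag="all"):
    """Apply the country and tag filters, where "all" means no filter."""
    country_ok = country == "all" or article["country"] == country
    tag_ok = tag == "all" or tag in article["tags"]
    return country_ok and tag_ok


def tag_report(records, **query):
    """Percent of articles matching the query that carry each tag."""
    matched = [a for a in records if matches(a, **query)]
    counts = Counter(t for a in matched for t in a["tags"])
    return {t: 100 * n / len(matched) for t, n in counts.items()}


def country_report(records, **query):
    """Percent of each country's articles that match the query."""
    totals = Counter(a["country"] for a in records)
    hits = Counter(a["country"] for a in records if matches(a, **query))
    return {c: 100 * hits[c] / totals[c] for c in totals}


by_tag = tag_report(articles, country="Kenya")
by_country = country_report(articles, tag="price")
```

The two reports use different denominators: the tag report divides by the number of articles matching the query, while the country report divides by each country's total article count.
-->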
<section class="subsection">
<h3>Results</h3>
<div id="express-loading">Please wait...</div>
<div id="express-report"></div>
</section>
</section>
<section id="insights">
Our insights document is available as Google Slides and hosted off-site.
<div class="cta">
<a class="button" href="https://docs.google.com/presentation/d/1WZCUKordfvyywiK_68qgxslBUtKSVTxSmvbZwY1leAc/edit?usp=drive_link">Open Slides</a>
</div>
</section>
<section id="method">
<h2>Method</h2>
This <a href="https://medium.com/ideo-colab/natural-language-processing-a-new-way-to-listen-to-people-about-health-8cfe9bece1ce">natural language processing</a> project utilizes a number of technologies to source data and then produce an interactive visualization:
<ul>
<li>News sources are queried through <a href="https://newsdata.io/">Newsdata.io</a> using a filter for articles discussing food.</li>
<li>Search parameters are converted to each article's language using <a href="https://aws.amazon.com/translate/">Amazon Translate</a>, the same automated system used to convert all article information to a single common language (English, the most common original language among the articles) before additional processing.</li>
<li>This application determines the topics of an article by having a machine read and categorize its contents. This happens through either Method A or Method B.</li>
<li>Method A: In an algorithm called <a href="https://dl.acm.org/doi/10.5555/944919.944937">LDA</a>, the computer examines how often different words appear in articles in general as well as how often they appear together, creating a sense for which words may be in the same topic (like dog and cat) and, thus, which topics a document contains.</li>
<li>Method B: After filtering commonly occurring words that may be less useful in determining article topic (<a href="https://www.learndatasci.com/glossary/tf-idf-term-frequency-inverse-document-frequency/">TF-IDF</a>), the words of an article are converted to numbers describing their meaning (<a href="https://towardsdatascience.com/word2vec-explained-49c52b4ccb71">Word2Vec</a>) where dog and cat may have more similar numbers than dog and truck. Finally, the computer clusters together similar words or, put another way, groups words with similar numbers (<a href="https://link.springer.com/chapter/10.1007/978-3-642-37456-2_14">HDBSCAN</a>).</li>
<li>The team double checks and, when necessary, performs minor refinements of the resulting topics before performing "<a href="https://www.nngroup.com/articles/affinity-diagram/">affinity diagramming</a>" where topics are clustered further into a smaller number of "categories" as shown in the visualization.</li>
<li>A visualization using <a href="https://developers.google.com/public-data/docs/canonical/countries_csv">country centroids</a> is built using a technology called <a href="https://sketchingpy.org/">Sketchingpy</a> and deployed to the web.</li>
</ul>
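<!--
Maintainer note (not rendered on the page): a minimal sketch of the TF-IDF weighting mentioned in Method B, using only the Python standard library. It assumes lowercase whitespace tokenization and the common tf * log(N / df) weighting. The helper name `tf_idf` and the sample titles are hypothetical, and the actual implementation in the gafj-pipeline repository may differ. The Word2Vec and HDBSCAN steps are not covered here.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {word: weight} dict per document (hypothetical helper)."""
    tokenized = [doc.lower().split() for doc in documents]
    total = len(tokenized)

    # Document frequency: number of documents containing each word.
    doc_freq = Counter()
    for words in tokenized:
        doc_freq.update(set(words))

    weights = []
    for words in tokenized:
        counts = Counter(words)
        # Term frequency within the document times inverse document frequency.
        weights.append({
            word: (count / len(words)) * math.log(total / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


# Hypothetical article titles, used only for illustration.
titles = [
    "food prices rise",
    "food aid arrives",
    "food security policy",
]
scores = tf_idf(titles)
```

A word that appears in every title (here "food") receives a weight of zero, which illustrates why such words contribute little to determining an article's topic and can be filtered before the later steps.
-->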
This research effort provides a formal description of these methods along with evaluation information and limitations in a <a href="/img/whitepaper.pdf">supplemental whitepaper</a>. The application currently uses Method B. The <a href="https://github.com/SchmidtDSE/gafj-viz">visualization code</a> and <a href="https://github.com/SchmidtDSE/gafj-pipeline">data pipeline code</a> are both open source where the linked repositories provide additional details on open source technologies used. Current coverage is from Dec 2022 to Nov 2023. See the <a href="#about">about section</a> for open source, additional conditions, and further details including dataset download links. Finally, note that there was geographic bias in availability of article content so analysis only uses titles for equity reasons.
</section>
<section id="about">
<h2>About</h2>
This is an open source academic research project.
<section class="subsection">
<h3>Data license and open source</h3>
Code available under an <a href="https://github.com/SchmidtDSE/gafj-viz/blob/main/LICENSE.md">open source license</a>. The project makes the following repositories available:
<ul>
<li><a href="https://github.com/SchmidtDSE/gafj-viz">Visualization</a></li>
<li><a href="https://github.com/SchmidtDSE/gafj-pipeline">Pipeline</a></li>
</ul>
Original publisher retains copyright to article content and some metadata including title. Please ensure you have rights or fair use to use the materials given the specifics of derivative work. <a href="/csv/articles.csv">Data available for download</a> under the <a href="https://creativecommons.org/licenses/by-nc/4.0/">CC-BY-NC License</a>. The full data download is only intended for academic research and, by downloading it, you agree to use it only for academic purposes. See provider <a href="https://newsdata.io/">Newsdata.io</a> for more information. By using these data you agree they are made available without any warranty of any kind.
</section>
<section class="subsection">
<h3>Credits</h3>
Collaboration between the <a href="https://dse.berkeley.edu/">Eric and Wendy Schmidt Center for Data Science and Environment</a> and the <a href="https://futureoffood.org/">Global Alliance for the Future of Food</a>. See <a href="/humans.txt">humans.txt</a> for more details and full credits. Open source libraries used are available in <a href="https://github.com/SchmidtDSE/gafj-viz/blob/main/README.md">visualization README</a> and <a href="https://github.com/SchmidtDSE/gafj-pipeline/blob/main/README.md">data pipeline README</a>.
</section>
<section class="subsection">
<h3>Privacy</h3>
This application records standard server access logs for security and stability reasons, including to prevent abuse. This is a common practice, employed by many websites, that is necessary for maintaining application function. This information includes:
<ul>
<li>IP addresses which describe from which location and internet connection the application is accessed.</li>
<li>User agent strings which provide basic information about the device and software ("browser") used to access the site.</li>
<li>The requested URLs which are unique identifiers for data and other resources requested from our servers.</li>
</ul>
These logs are maintained for security reasons and to ensure strong application performance but are anonymized within 7 days of collection. These anonymized statistics are maintained indefinitely as aggregated information for us to understand usage patterns in order to improve the service and maintain its performance. The anonymized data may also include error or bug reports. That said, no other personally identifying information is collected, including email address, phone number, race, ethnicity, age, etc. Furthermore, please note that:
<ul>
<li>Personally identifying information is never shared or sold.</li>
<li>Anonymized data do not include specific IP addresses.</li>
<li>IP addresses are not retained past 7 days except for security / abuse prevention purposes if potential unusual behavior is detected, such as a large number of requests.</li>
<li>No information we collect is used for advertising.</li>
<li>Though we do not ask users their age to respect their privacy, this webpage is intended for those 18 years and older.</li>
<li>This website does not use cookies.</li>
<li>Your device may "cache" parts of the application "locally" on your machine for performance reasons as dictated by your browser's settings.</li>
</ul>
See <a href="https://www.dreamhost.com/legal/customer-data-processing-addendum/">DreamHost CDPA</a> for more information about our web host / subprocessor. Anonymized bug reports may be managed with <a href="https://sentry.io">Sentry</a>. Furthermore, video hosting is provided by Vimeo and traffic may be generated against their servers under the <a href="https://vimeo.com/terms">Vimeo terms</a>. For users who elect to navigate off our site to review the insights doc, see the <a href="https://policies.google.com/terms?hl=en">Google Terms of Service</a>. Finally, use of the "express" version also generates web traffic with AWS as described in the <a href="https://docs.aws.amazon.com/whitepapers/latest/navigating-gdpr-compliance/aws-data-processing-addendum-dpa.html">Amazon Web Services DPA</a>. This page was last updated 2024-01-19 and may be revised in the future.
</section>
<section class="subsection">
<h3>Security</h3>
We take security seriously. Communication between your device and our servers is encrypted using Secure Sockets Layer (SSL). Access to non-anonymized logs and the configuration / code for the application is limited to the current maintainers of the project, automated systems we've constructed for running the application, and our subprocessors. Note that anonymized data may be shared with project partners for the purposes of tracking project success and impact.
</section>
<section class="subsection">
<h3>Rights and warranty</h3>
By using these data and this website you agree to the privacy and security terms above. Furthermore, you agree that this application and its data are provided without any warranty of any kind. Use at your own risk and ensure you have appropriate rights to or fair use of the underlying content if appropriate when creating derivative products.
</section>
</section>
</main>
<footer>
<div><a href="#about" class="internal-link">Privacy and About</a> / <a href="#app" class="internal-link">In-Browser Visualization</a> / <a href="#express" class="internal-link">Mobile and Accessibility Optimized</a>
</div>
<div>
Collaboration of <a href="https://dse.berkeley.edu/">University of California Berkeley Schmidt DSE</a> and <a href="https://futureoffood.org/">GAFF</a> - for academic research purposes only
</div>
</footer>
<!-- Third party -->
<script src="/third_party_web/tabby.polyfills.min.js?v=0.1.4"></script>
<script src="/third_party_web/d3.min.js?v=0.1.4"></script>
<!-- Support -->
<script src="/js/main.js?v=0.1.4"></script>
<script src="/js/express.js?v=0.1.4"></script>
</body>
</html>