<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Lilian H on Medium]]></title>
        <description><![CDATA[Stories by Lilian H on Medium]]></description>
        <link>https://medium.com/@lilianhj?source=rss-b6160498d99d------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*dmbNkD5D-u45r44go_cf0g.png</url>
            <title>Stories by Lilian H on Medium</title>
            <link>https://medium.com/@lilianhj?source=rss-b6160498d99d------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Sat, 16 May 2026 22:57:24 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@lilianhj/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Continued Bot Infiltration of Trump’s Facebook Pages]]></title>
            <link>https://medium.com/data-for-democracy/continued-bot-infiltration-of-trumps-facebook-pages-2df82ca86b5b?source=rss-b6160498d99d------2</link>
            <guid isPermaLink="false">https://medium.com/p/2df82ca86b5b</guid>
            <dc:creator><![CDATA[Lilian H]]></dc:creator>
            <pubDate>Mon, 01 May 2017 17:01:02 GMT</pubDate>
            <atom:updated>2017-05-01T17:01:02.673Z</atom:updated>
            <content:encoded><![CDATA[<p><strong><em>Researched and written by C.E. Carey</em></strong></p><p>This past Thursday, Facebook released a <a href="https://fbnewsroomus.files.wordpress.com/2017/04/facebook-and-information-operations-v1.pdf">white paper</a> which “does not contradict” the <a href="https://www.intelligence.senate.gov/sites/default/files/documents/ICA_2017_01.pdf">U.S. intelligence community’s assessment</a> that its platform was used to carry out influence operations leading up to the 2016 election. Though much has been written about the role of <a href="https://www.nytimes.com/2016/11/18/technology/automated-pro-trump-bots-overwhelmed-pro-clinton-messages-researchers-say.html">bots</a> and <a href="http://www.thedailybeast.com/articles/2017/03/30/russia-s-info-war-on-the-u-s-started-in-2014.html">influence campaigns</a> — both <a href="http://www.reuters.com/article/us-usa-russia-election-exclusive-idUSKBN17L2N3">foreign</a> and <a href="http://www.newyorker.com/magazine/2017/03/27/the-reclusive-hedge-fund-tycoon-behind-the-trump-presidency">domestic</a> — in the election, it’s unclear what role — if any — such campaigns may play post-election. Have the bots stuck around to continue pushing their messages? Have they disappeared? Have they moved on to other <a href="http://www.politico.com/magazine/story/2017/01/why-russia-loves-the-idea-of-california-seceding-214632">causes</a> and <a href="http://www.reuters.com/article/us-france-security-facebook-idUSKBN17F25G">upcoming elections</a>? Have new “<a href="http://blogs.discovermagazine.com/d-brief/2017/01/20/twitter-bot-army/">bot armies</a>” taken their place?</p><p>Our group previously provided evidence of a <a href="https://medium.com/data-for-democracy/sockpuppets-secessionists-and-breitbart-7171b1134cd5">coordinated social media influence campaign</a> during the 2016 election season marked by sudden, simultaneous shifts in language across multiple platforms, including Facebook. 
For those analyses, we identified nearly 30,000 accounts posting duplicate content (e.g., multi-word comments each identical to at least 4 others) on <a href="https://www.facebook.com/DonaldTrump">Donald Trump’s official Facebook page</a> and <a href="https://www.facebook.com/donaldtrumppresident/">a fan-created page</a> during the campaign. We considered this pattern of posting behavior to be consistent with that of automated accounts, or <em>bots</em>. When repeating this process using comments from both before and after the election and inauguration (July 14, 2015, to April 9, 2017), we once again identified roughly 30,000 likely bots — 32,003, to be exact.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*1CeB6uTu4XNeQqDH." /></figure><p>These updated analyses indicate that many campaign-era bots remain in “maintenance mode” today, continuing to post somewhat regularly and ready to pounce whenever a particular administration action piques their interest. New bot networks have also been deployed in response to specific events since the election. For the time being, bots continue to be actively involved in the social media ecosystem and, without intervention on the part of the platforms, show no signs of letting up anytime soon.</p><p>(Throughout this article, we refer to any account that posted at least one 10+ token comment identical to 4+ others in our dataset as a “bot.” However, this method of bot identification is far from perfect; for example, any human-operated account copying and pasting “viral” messages would be classified as a bot for our purposes, and more sophisticated automated accounts that generate unique content for each post would not be caught by our somewhat crude classification system.)</p><h3>The Election</h3><p>Bots were most active on Trump’s Facebook pages right before the election, with nearly 15,000 active in the 30 days leading up to November 8, 2016. 
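</p><p>The duplicate-comment rule described above (an account is flagged once it posts a comment of 10 or more tokens whose exact text is shared by at least 4 other comments) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the analyses: the function name <code>flag_likely_bots</code>, the (account, text) input format, and whitespace tokenization are all hypothetical choices.</p><pre>
```python
from collections import Counter


def flag_likely_bots(comments, min_tokens=10, min_duplicates=4):
    """Return the set of accounts that posted at least one long duplicated comment.

    comments: list of (account_id, text) pairs (hypothetical input format).
    A comment counts as duplicated when its exact text matches at least
    min_duplicates other comments in the dataset.
    """
    comments = list(comments)

    # Count how many times each exact comment text appears in the dataset.
    text_counts = Counter(text for _, text in comments)

    flagged = set()
    for account, text in comments:
        # Assumption: a "token" is a whitespace-separated word.
        long_enough = len(text.split()) >= min_tokens
        # "Identical to N others" means the text occurs at least N + 1 times.
        duplicated = text_counts[text] >= min_duplicates + 1
        if long_enough and duplicated:
            flagged.add(account)
    return flagged
```
</pre><p>As the caveat above notes, a rule like this also flags people who copy and paste the same long message, and it misses automated accounts that vary their text, so the result is best read as a set of likely bots.</p><p>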
Though bots made up only 3% of active accounts, they posted 14% of all comments during this time.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1000/0*JQS2dhI917XBDo7E." /></figure><p>This pre-election spike in bot activity mirrored an overall increase in posting, even among non-bot accounts. However, the jump in bot activity happened earlier and was more sustained, whereas non-bot activity increased rapidly in the days just prior to the election. Both peaked on November 9, the day after the election.</p><p>Immediately after the election, posting levels dropped across the board, for both bot and non-bot accounts. Of the more than 25,000 bot accounts active before the election, 44% continued posting after November 9, 2016.</p><h3>The First 100 Days</h3><p>During the first 100 days of Donald Trump’s presidency (although technically, only the first 80 days were included in our dataset), 32% of bots active pre-election continued to post — conversely, 63% of bots active post-inauguration were active pre-election. Bots generally remained active at post-election, pre-inauguration levels, posting roughly 12% of comments per day. Non-bot activity, on the other hand, trended downward over time.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*fPFVBP3-TapnUIkV." /></figure><p>Relative daily posting activity tracked similarly across both groups, likely a result of all comments being posted in response to page posts. For example, if a page produces 5 posts one day, and only 1 post another day, the day with 5 posts would present more opportunity for bots and non-bots alike to comment.</p><p>However, several notable differences in activity occurred. For example, during the first 2 weeks of the Trump presidency, 3 events provoked divergent responses from bot vs. non-bot accounts:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1000/0*Fe0_xefacKLhMDeV." 
/></figure><p>Non-bot accounts showed a dramatic spike in posting behavior on January 20, 2017 — the day of President Trump’s inauguration. Though bots also increased activity relative to the preceding days, the spike was nowhere near as drastic. Conversely, a spike in bot activity occurred on January 27, 2017 — the day President Trump signed an executive order halting immigration and refugee resettlement from 7 Muslim-majority nations. Whereas non-bot accounts posted substantially more comments on Inauguration Day than on the day of the so-called “Muslim ban,” for bots this pattern was reversed. Finally, a spike in non-bot commenting on January 31, 2017 — the day President Trump nominated (then-) Judge Neil Gorsuch to the US Supreme Court — was not accompanied by as large an increase in bot activity.</p><p>These different patterns of posting suggest that bots and non-bots have different “interests:” bots were <em>less </em>interested in the pomp and circumstance of the 58th US Presidential Inaugural Ceremonies and nomination of a traditionally conservative judge to the Supreme Court, while relatively <em>more</em> interested in an executive action affecting foreign relations, than non-bots.</p><h4>The Syria Strike Spike</h4><p>Post-election, April 7, 2017 — the date of the Syria strike — was the most active day for bots, with just under 2500 bot accounts posting over 6500 messages. On that day alone, bots made up 12% of active users and nearly one in four comments. Though posting by non-bots increased as well, it was not nearly to the same degree as the increase in bot posts.</p><p>Looking into the content of the bot posts, almost all were, predictably, about the Syria strike. 
Interestingly, more than half of the active bot accounts posted one (or more) of 3 almost identical messages, which in total made up just under half of all bot-posted messages that day:</p><blockquote>These blows increase our determination, strength and will, and you are still condemning your great lie by calling it a revolution. Israel has increased its frustration with the strength of the Syrian Arab army, which is fighting on all fronts at home and this border is Syria, with its strength and successful leadership, the people are growing in love with its army and leadership. We resist and resist because we are right owners no matter how you and all the traitors and those who call themselves kings We will not surrender here the Syrian Arab Army. Here are the protectors of the homeland. <strong>[Posted 1500+ times]</strong></blockquote><blockquote>We Syrian people denounce the attack on Our air base, which has always played a major role in the fight against terrorism. <strong>[Posted 1000+ times]</strong></blockquote><blockquote>Trump you exceeded your limits in this strike American Zionist Saudi Arabia. We will tell you that the Syrian Arab Army is the myth and master of this time are the protectors of the homeland and its immune system. The reply is coming. Here is Syria Assad. <strong>[Posted 350+ times]</strong></blockquote><p>Of the bot accounts which posted those messages, only 8 (less than 1%) had posted prior to the election. Only 2% had posted prior to April 7, 2017. In contrast, of the bot accounts which posted <em>other</em> messages, 1 in 5 were active pre-election, and two-thirds were active prior to April 7. This suggests that additional bots may have been deployed to push pro-Assad messages in response to the Syria strikes. 
It’s worth noting that even when discounting messages from these “new” accounts, the remaining posts — greater than 3500 in all — still represent one of the highest single-day post volumes post-election.</p><h4>Dueling Botnets</h4><p>Nearly 1 in 3 bot accounts active during the early days of the Trump administration had not posted prior to the election. Some appear to have been activated immediately post-election, while yet another group of roughly equal size was deployed post-inauguration. Both of these groups have maintained a stable presence since then, with the exception of the day of the Syria strike. As mentioned earlier, the surge in active bot accounts that day likely represented an independent “bot attack,” but for visualization purposes, they are counted as part of the “post-inauguration” bot group.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*fWIj5R7MMypxJHmq." /></figure><p>This observation raises the possibility that separate actors may be controlling independent “bot armies,” each with specific — and perhaps conflicting — interests and intentions. And even as pre-election bots drop off, more may step in to take their place.</p><h3>What’s Next?</h3><p>Civic engagement with our elected officials, including the president, as well as other interested citizens is a critical component of democracy and US government. 
The Internet and social media were supposed to usher in an era of “<a href="https://www.edge.org/3rd_culture/sanger07/sanger07_index.html">democratization of knowledge</a>,” wherein everyday citizens — not just elected officials and those with special access — could have unprecedented access to information about the inner workings of government and be able to discuss policy with others across the country — and the world.</p><p>Then came the 2016 election, in which Donald Trump surprisingly (at least to most <a href="http://www.politico.com/story/2016/11/election-results-2016-clinton-trump-231070">political pundits</a>, <a href="http://www.cnn.com/videos/politics/2016/11/12/pollster-eats-bug-after-trump-win-smerconish.cnn">pollsters</a>, and <a href="http://www.newseum.org/todaysfrontpages/?tfp_display=archive-date&amp;tfp_region=USA&amp;tfp_sort_by=state&amp;tfp_archive_id=110916&amp;tfp_show=all">legacy media outlets</a>) overtook frontrunner Hillary Clinton in the Electoral College to become the 45th President of the United States. Countless election “<a href="https://www.brookings.edu/blog/fixgov/2016/11/16/choose-your-own-post-mortem-part-1/">postmortems</a>” have focused on the role of “<a href="http://www.cbsnews.com/news/how-fake-news-find-your-social-media-feeds/">fake</a> <a href="http://www.metrotimes.com/news-hits/archives/2017/04/04/study-says-michigan-was-bombarded-with-fake-news-during-2016-campaign">news</a>” <a href="http://nymag.com/scienceofus/2016/11/how-facebook-and-the-filter-bubble-pushed-trump-to-victory.html">and</a> <a href="https://www.nytimes.com/2016/11/14/technology/facebook-is-said-to-question-its-influence-in-election.html">social</a> <a href="https://news.vice.com/story/journalists-and-trump-voters-live-in-separate-online-bubbles-mit-analysis-shows">media</a> in influencing public opinion. 
In these less rosy takes, the “democratization of knowledge” has also increased public access to <a href="http://www.pnas.org/content/113/3/554.full"><em>mis</em>information</a>, often pushed by “users” <a href="https://www.intelligence.senate.gov/sites/default/files/documents/ICA_2017_01.pdf">who don’t have America’s best interests in mind</a>.</p><p>Moreover, the massive prevalence of <a href="http://firstmonday.org/ojs/index.php/fm/article/view/7090/5653a">automated content</a> and content posted by disingenuous actors (see also <a href="https://www.wired.com/2014/07/virtual-unreality-the-online-sockpuppets-that-trick-us-all/">sockpuppets</a> and <a href="http://www.politico.com/magazine/story/2017/03/memes-4chan-trump-supporters-trolls-internet-214856">trolls</a>) violates an implicit compact users make when they submit their personal information to social media platforms: that they will be engaging with other actual <em>people</em>. When Facebook users visit an official or fan-created page of their president, they expect to receive updates about his actions and interact with other users — positively or negatively, civilly or less-than-civilly — in response. If over 1 in 10 comments on a typical day are the work of automated accounts, however, <a href="http://www.pewinternet.org/2017/03/29/the-future-of-free-speech-trolls-anonymity-and-fake-news-online/">the opportunity for discourse collapses</a>.</p><p>In the months since the election, Facebook has taken multiple steps to combat <a href="https://newsroom.fb.com/news/2016/12/news-feed-fyi-addressing-hoaxes-and-fake-news/">fake news</a> and police <a href="https://www.facebook.com/notes/facebook-security/improvements-in-protecting-the-integrity-of-activity-on-facebook/10154323366590766">fake accounts</a>. 
On the fake news front, they’ve begun <a href="https://www.recode.net/2017/3/4/14816254/facebook-fake-news-disputed-trump-snopes-politifact-seattle-tribune">flagging “disputed” news stories</a>, <a href="https://techcrunch.com/2017/04/06/facebook-puts-link-to-10-tips-for-spotting-false-news-atop-feed/">posted tips on how to spot fake news</a>, and even taken out full-page print ads in newspapers in <a href="https://techcrunch.com/2017/04/14/facebook-runs-full-page-newspaper-ads-against-fake-news-in-france-ahead-of-the-election/">France</a> and <a href="http://fortune.com/2017/04/14/facebook-fake-news-germany/">Germany</a>. Regarding fake accounts, they recently shut down <a href="http://www.reuters.com/article/us-france-security-facebook-idUSKBN17F25G">30,000 accounts in France</a> in advance of the presidential election and disrupted an <a href="https://www.facebook.com/notes/facebook-security/disrupting-a-major-spam-operation/10154327278540766/">international spam operation</a>. Notably, these Facebook-initiated anti-bot measures took place directly after the period covered in our analyses of Trump’s Facebook pages. 
It remains to be seen whether they will have any impact site-wide, or whether fake accounts are <a href="http://www.nbcnews.com/tech/security/crackdowns-social-media-accounts-backfire-driving-demand-n746841">so embedded in the social media economy</a> that their continued use is all-but-inevitable.</p><p>Nonetheless, one thing’s for certain: they’re not going away on their own.</p><hr><p><a href="https://medium.com/data-for-democracy/continued-bot-infiltration-of-trumps-facebook-pages-2df82ca86b5b">Continued Bot Infiltration of Trump’s Facebook Pages</a> was originally published in <a href="https://medium.com/data-for-democracy">Data for Democracy</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Hackathon Spotlight #1: Linking lobbyists and Chicago legislators with data]]></title>
            <link>https://medium.com/data-for-democracy/hackathon-spotlight-1-linking-lobbyists-and-chicago-legislators-with-data-290e8435b0cb?source=rss-b6160498d99d------2</link>
            <guid isPermaLink="false">https://medium.com/p/290e8435b0cb</guid>
            <category><![CDATA[chicago]]></category>
            <category><![CDATA[hackathons]]></category>
            <category><![CDATA[civictech]]></category>
            <category><![CDATA[data-science]]></category>
            <category><![CDATA[open-data]]></category>
            <dc:creator><![CDATA[Lilian H]]></dc:creator>
            <pubDate>Mon, 17 Apr 2017 18:01:02 GMT</pubDate>
            <atom:updated>2017-04-17T20:25:17.342Z</atom:updated>
            <content:encoded><![CDATA[<h4>A collaboration between Data for Democracy and data.world</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*juodUd0zuF9x5QhULfip9g.png" /><figcaption>Our final visualization. Read on to see how we made it!</figcaption></figure><h3>Bribery, extortion, and wire fraud.</h3><p>Those are the charges filed against Willie Cochran, a Chicago alderman <a href="http://bigstory.ap.org/article/06ce199233964753ac05252e4b9055d1/chicago-alderman-facing-corruption-charges-due-court">accused of stealing</a> “at least $30,000 from a charitable fund for poor constituents.” Cochran is the latest in a long line of Chicago officials indicted for corruption charges — over 1,000 since 1973.</p><p>Pretty stiff competition to get into the Chicago <a href="http://www.chicagotribune.com/news/local/breaking/ct-chicago-convicted-aldermen-htmlstory.html">Hall of Shame</a>.</p><p>So, how does an interested citizen stay informed about their local representatives?<strong> </strong>It seems impossible to keep track of legitimate political contributions, let alone the shady ones.</p><p>But, what if you <em>could</em> trace the money, based on publicly available data?</p><h3>Enter Data for Democracy and data.world.</h3><p>We at <a href="http://datafordemocracy.org/">Data for Democracy</a> are a group of 1,300 self-organized volunteers who work with data in order to improve the global community. D4D uses <a href="https://data.world/">data.world</a>, the social network for data people, to discover and explore data while collaborating on data projects.</p><p>Last weekend, hundreds of volunteers met up in major cities across the country (and remotely) to participate in the first <a href="https://medium.com/data-for-democracy/data-for-democracy-hackathon-happening-this-weekend-d3d694c1d966">D4D hackathon</a>. 
We worked on dozens of civic projects, leveled up our data skills, and ate our weight in pizza (or at least, we did in Chicago).</p><h3>Here’s what our group did in less than 24 hours together, and all the data you need to dig in yourself.</h3><h4>It all started with an idea Stephanie floated before the hackathon:</h4><blockquote>“We could link alder-people by caucus or other ties, see their connections on the graph through politics and through lobbyist connections…maybe something in Shiny app form that people can play with… a network igraph would be pretty rad,” <em>she mused on the #city-chicago Slack channel.</em></blockquote><p>We considered the possibilities — what if we could link individual contributions to an alderman’s voting records? Could we track which special interests were influencing votes, understand what, when, and how much lobbyists were paying caucuses, <em>and</em> find a way to put it all out in the open? That <em>would</em> be pretty rad.</p><p>So, we set out to answer this question:</p><h3>How does money make its way from a business or special interest into the hands of Chicago’s elected officials?</h3><h4>Step 1: Get the data</h4><p>We started with the City of Chicago’s open data portal. A <a href="https://data.cityofchicago.org/browse?q=lobby&amp;sortBy=relevance&amp;utf8=%E2%9C%93">quick search</a> yielded a number of relevant datasets, including a list of all registered lobbyists since 2012 and their political contributions. It (almost) sounded like the work was already done!</p><p>But, as we explored the data, it wasn’t all sunshine and rainbows. Client industries were classified haphazardly, column definitions weren’t always clear, and tracing lobbyists to legislators would require joining multiple files. That meant we’d have to download each dataset (breaking the provenance chain) and create new files from them. 
We wouldn’t be able to share our findings or process <em>alongside the data</em>, making it more difficult for others to benefit from our analysis.</p><p>That’s where data.world came in. While preparing for the hackathon, we imported key datasets from the City of Chicago portal and other sources. We gathered data about <a href="https://data.world/search?q=chicago+lobbying&amp;tags=chicago+lobbying">Chicago lobbyists</a> as well as a <a href="https://data.world/sharon/chicago-data">master dataset</a> with links for each of the Chicago hackathon projects. This gave participants a single destination for all the relevant sources:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*c8-OgQqdpb0Qhn-BfIWklg.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Q8lpPvGy2ARpxA3JiCJn5w.png" /><figcaption>A one-stop-shop for Chicago data on <a href="https://www.data.world">data.world</a></figcaption></figure><h4>Step 2: Get the people</h4><p>On Friday night, we all met in person for the first time. 
After weeks of chatting over Slack about potential data projects, scrutinizing our hackathon playlist, and debating who makes the best pizza in Chicago, we were finally ready to hack all the things!</p><p>Each lead pitched their idea to the group, and our 35 local participants broke out into three projects:</p><ul><li><strong>Chicago EPA </strong>started investigating Chicago’s water quality in light of <a href="http://www.chicagotribune.com/news/local/breaking/ct-trump-epa-chicago-impact-met-20170316-story.html">funding cuts to programs that clean up the Great Lakes</a>.</li><li><strong>Bolsa Familia </strong>got to work analyzing the <a href="https://en.wikipedia.org/wiki/Bolsa_Fam%C3%ADlia">Brazilian social welfare program</a> and its relationship to the 2014 election of Dilma Rousseff.</li><li>And of course, our project, <strong>Chicago lobbying</strong>, set out to connect the dots between special interests, lobbyists, and Chicago aldermen.</li></ul><h4>Step 3: Get to work</h4><p>Once we split into groups, Team ChiLo got down to business. We whiteboarded the problem and broke it down into tasks. We dug into the data on <a href="https://data.world/sharon/chicago-data">data.world</a> and researched <a href="https://www.buzzfeed.com/johntemplon/help-us-map-trumpworld">visualizations</a> similar to what we envisioned for our end result.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*OvKUjmex5fwtrs7R0zw48Q.jpeg" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*PXeMVzfW6aXVs5J2St2GlQ.jpeg" /><figcaption>Team ChiLo in action at the <a href="https://www.thisismetis.com/">Metis Data Science Bootcamp</a> space!</figcaption></figure><p>With only 10 hours together in person, our task list felt daunting. 
For the weekend, we decided to limit our scope to Chicago’s 50 aldermen instead of all elected officials.</p><p>We got to work cleaning and joining datasets in the data.world query tool (with the help of <a href="https://meta.data.world/advancedsql-62a01316de3a#.83po2oq1b">this handy tutorial</a>), classifying the funding sources by industry, and joining alderman voting records with legislation details. We wrangled the data with a lot of R, some Python, and even good old-fashioned Excel. People of all skill levels contributed, using their tool of choice.</p><p>Of course, there were ups and downs. Early on, we were excited to find unique IDs for lobbyists…that turned out to <em>not</em> be unique…that after more investigation we decided might actually be unique. We had to make a last-minute decision to constrain the time period to fewer years than we had hoped to analyze. But we persevered, and by half-way through the first night, we had enough data cleaned and transformed to spin off a dedicated <a href="https://data.world/lilianhj/chicago-lobbyists">Chicago lobbyists dataset</a>:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*l5fiu9mJzCLhOZziFW0SZg.png" /><figcaption>Check out the <a href="https://data.world/lilianhj/chicago-lobbyists">Chicago lobbyists dataset →</a></figcaption></figure><p>We added file descriptions and column definitions so others could understand the variables:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*-eXRcLB9by98LnGNIttfOA.png" /><figcaption>data.world automatically aggregated descriptions from every file in the dataset, along with each column type.</figcaption></figure><p>Using the built-in <a href="https://docs.data.world/tutorials/dwsql/">query editor</a>, we joined three files to combine data about aldermen, lobbyists, and clients into a single table:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*xb8Z-KTt6L8GA-7plEFlcw.png" /><figcaption><a 
href="https://data.world/lilianhj/chicago-lobbyists/query/f474fec3-1d2b-4a8c-acff-d2ebeed6d905">Dig into this query for yourself →</a></figcaption></figure><p>We exported that query and added it to the dataset as a new CSV. Opening the file full-screen, we got a good look at the data alongside the column metadata and visualized the data by selecting different columns to chart. Apparently, the most contributions come from the hospitality industry — over twice the amount of the next-highest sector (technology). Who knew!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*YcToU0LIsGlzFGIfmlrhEg.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*7uMVtN6tcBUVt_2OKLfEFA.png" /><figcaption><a href="https://data.world/lilianhj/chicago-lobbyists/file/client-lobbyist-alderman.csv">Explore this file →</a></figcaption></figure><h4>Step 4: Make it visual</h4><p>By the next morning, we were ready to start on our showcase visualization. We settled on a network graph with a node for each funding source (orange), lobbyist (light blue), and alderman (dark blue). You can filter by alderman and see contribution amounts by hovering over each connection.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*O_b71J2wXCbrSI3NwjcrSA.png" /><figcaption>A picture’s worth a thousand words, but an animation’s worth a million. Check out <a href="https://nathanielwroblewski.github.io/data-for-democracy-2017/">the interactive version →</a></figcaption></figure><p>By the end of the hackathon, we had racked up two <a href="https://data.world/stephen-hoover/chicago-city-council-votes">new</a><em> </em><a href="https://data.world/lilianhj/chicago-lobbyists">datasets</a>, a bunch of <a href="https://github.com/skirmer/shinyapp">R scripts</a>, some new SQL knowledge <em>(well, for a lot of us!)</em>, and a kick-ass <a href="https://nathanielwroblewski.github.io/data-for-democracy-2017/">D3 visualization</a>. 
Really, something for everyone.</p><p>Plus we met some fellow data nerds, which was (at least!) half the fun.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*2NWNtEOhQHA1zMGYPAvazw.jpeg" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Yqy2OTAaV_NNMoxXIsD1jg.jpeg" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*f8Y3Ske49IJhZYkq5dTgZA.jpeg" /><figcaption>From L to R: Lilian and Nate working on the visualization; Lilian and Stephanie with their matching Sparkle stickers; and the Chicago gang that stuck it out til the end of two rigorous days of hacking!</figcaption></figure><p>At the end of our time together on Saturday, we recorded our progress for the rest of the teams to watch during the final showcase. You can see our video here:</p><iframe src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fwww.youtube.com%2Fembed%2FkyAxGe1USMI%3Ffeature%3Doembed&amp;url=http%3A%2F%2Fwww.youtube.com%2Fwatch%3Fv%3DkyAxGe1USMI&amp;image=https%3A%2F%2Fi.ytimg.com%2Fvi%2FkyAxGe1USMI%2Fhqdefault.jpg&amp;key=d04bfffea46d4aeda930ec88cc64b87c&amp;type=text%2Fhtml&amp;schema=youtube" width="854" height="480" frameborder="0" scrolling="no"><a href="https://medium.com/media/4f0a130e34c1370cbfcf95450fddea13/href">https://medium.com/media/4f0a130e34c1370cbfcf95450fddea13/href</a></iframe><h3>But the fun didn’t stop there.</h3><p>In the last week, we’ve added two different Shiny apps, built using RStudio and the data.world SDK, to our list of accomplishments.</p><p>The first shows donations to aldermen by year. Lining them all up in a graph starts to show the magnitude of contributions to one alderman over another. 
The second, inspired by the Louisville hackathon team’s <a href="https://kristopherdelane.shinyapps.io/voting_record_kyga17/">voting record visualization</a>, makes it easy to search for a topic and see each alderman’s vote on the related legislation (like, say, Chicago’s 2016 Olympic bid):</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*jepbsmKC4JockT-iGk_UKQ.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*JR9-wWJVmYcie_r_E7Ml3g.png" /><figcaption><strong>Check out the interactive versions: </strong><a href="https://skirmer.shinyapps.io/chilobby/">Connections between aldermen and funders through lobbying →</a> and <a href="https://skirmer.shinyapps.io/chivotes/">Aldermanic voting records→</a></figcaption></figure><p>These dashboards were inspired by interesting tidbits we uncovered through some exploratory visualizations shared over Slack:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*yuS15JoBfajfAOyJDUSLJA.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*Hk6YerbyVDYUI1dcCQSvTQ.png" /><figcaption>Check out the queries that generated these charts: <a href="https://data.world/lilianhj/chicago-lobbyists/query/f474fec3-1d2b-4a8c-acff-d2ebeed6d905?columns=AMOUNT&amp;columns=recipient_surname&amp;tab=explore">contribution amounts by alderman</a> and count of <a href="https://data.world/lilianhj/chicago-lobbyists/query/5d10dfa3-0f3f-4c24-b414-3af4f9a5c9d4?columns=Alderman&amp;columns=Vote&amp;tab=explore">votes by alderman</a>.</figcaption></figure><p>…and we’re continuing to investigate the data, add new sources, and discuss our findings both <a href="https://data.world/stephen-hoover/chicago-city-council-votes/discuss/aldermanic-careers/10872">alongside the data</a> and in Slack:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*_8Qc1U0s4ehafG1BxeUUWA.png" /><figcaption>Ain’t no party like a D4D party, ’cause a D4D party don’t 
stop</figcaption></figure><p>One crucial discovery we’ve made since then is that the lobbyist IDs are not unique to each client-lobbyist relationship after all, and we have redone our app to reflect this. While we can’t pinpoint the flow of dollars with as much granularity as we had hoped, the app still allows us to visualize how clients are connected to aldermen via lobbyists, and to consider the broader web of interests at play. After all, a vital part of data science is being able to modify your position as the data — or your understanding of the data — changes.</p><h3>So, what’s next?</h3><p>After most hackathons, it’s rare to keep in touch with the people — or data — you met over the weekend. But, since the D4D community started remotely (sharing everything online over Slack<em>, </em><a href="https://github.com/data4democracy">GitHub</a>, and <a href="https://data.world/data4democracy">data.world</a><em>)</em>, we’ve already picked up where we left off!</p><p>Part of the magic of Data for Democracy is that “what’s next” is really up to each participant. Anyone can join (or create!) a new project and we’ve got <a href="http://datafordemocracy.org/projects.html">plenty to choose from</a>. So, join us on the <a href="http://datafordemocracy.org/contact.html">Data for Democracy Slack team</a>. We’ll be waiting for you in the #p-chicago-lobbying channel.</p><p><em>This article was written in collaboration with </em><a href="https://medium.com/u/4f2e2ed2de52"><em>Sharon Brener</em></a><em> and </em><a href="http://www.stephaniekirmer.com"><em>Stephanie Kirmer</em></a><em>, and originally posted on the </em><a href="https://blog.data.world/linking-lobbyists-and-chicago-legislators-a-collaboration-between-data-for-democracy-and-data-dbf30aeee70b"><em>data.world blog</em></a><em>. 
And of course, it wouldn’t be possible without the hard work of </em><a href="https://github.com/nathanielwroblewski"><em>Nate Wroblewski</em></a><em>, </em><a href="https://data.world/stephen-hoover"><em>Stephen Hoover</em></a><em>, and all the other D4D Chicago hackathon participants!</em></p><p><strong><em>This is the first installment in a series of posts highlighting the work done during D4D’s March 2017 hackathon. Check back in the coming weeks for more hackathon spotlight posts, as well as the conclusion to our Election Transparency post series!</em></strong></p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=290e8435b0cb" width="1" height="1" alt=""><hr><p><a href="https://medium.com/data-for-democracy/hackathon-spotlight-1-linking-lobbyists-and-chicago-legislators-with-data-290e8435b0cb">Hackathon Spotlight #1: Linking lobbyists and Chicago legislators with data</a> was originally published in <a href="https://medium.com/data-for-democracy">Data for Democracy</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[The Data for Democracy Hackathon is Over!]]></title>
            <link>https://medium.com/data-for-democracy/the-data-for-democracy-hackathon-is-over-a634b018e19a?source=rss-b6160498d99d------2</link>
            <guid isPermaLink="false">https://medium.com/p/a634b018e19a</guid>
            <category><![CDATA[hackathons]]></category>
            <category><![CDATA[open-data]]></category>
            <category><![CDATA[civictech]]></category>
            <dc:creator><![CDATA[Lilian H]]></dc:creator>
            <pubDate>Mon, 03 Apr 2017 18:02:00 GMT</pubDate>
            <atom:updated>2017-04-03T18:56:53.114Z</atom:updated>
            <content:encoded><![CDATA[<h4>Or: We’ll Always Have Slack</h4><p>After a flurry of activity that brought in 220 new members and pushed us over 101,000 messages on Slack, the first Data for Democracy hackathon has ended! Thank you so much to everyone who helped to make this event a success by generously contributing time and energy, whether online or in person. The work to use data for social good still continues! Read on to find out more about the projects that took place this weekend, and how you can get involved.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/689/1*BiuWLL_-QifrHkndAaLMLw.jpeg" /></figure><h3>The People</h3><p>They say a picture is worth a thousand words, so here are some of our favorite photos from this weekend’s in-person meetups, where volunteers collaborated on data projects of local, national, and international scope — and, in many cases, met each other in person for the first time!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*iL11bSgZTngeOBYNzBpkZw.jpeg" /><figcaption>Group discussions in Chicago</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*gzJoDwt9LRm-nJEf2jO1Jg.jpeg" /><figcaption>Volunteer Stephanie describing an idea for tracking the flow of money between lobbyists and Chicago officials</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*A8FVUw4JmbrTHEYP3yxrXQ.jpeg" /><figcaption>Boston’s Chief Data Officer, Andrew Therriault, at the Boston meetup</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*RJpMer0OCOINs2uPO21q3g.jpeg" /><figcaption>DC volunteers discussing topics ranging from political advertising…</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*z35xZD_H8RkyAEtSz-RVBA.jpeg" /><figcaption>…to discourse in online communities</figcaption></figure><figure><img alt="" 
src="https://cdn-images-1.medium.com/max/1024/1*EeDNrKhBrFI6wNni_qhoUg.jpeg" /><figcaption>All fired up in Seattle</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*M0-9GmzzcTYFpP--6ZKoJg.jpeg" /><figcaption>NYC volunteers figuring out how to restructure Zillow data to create city metrics for the USA Dashboard project</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*PgOlXNHJ8n69eXDEXzORZg.jpeg" /><figcaption>NYC volunteers presenting a new project on improving accessibility</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*YNoFinxH8sXH-UbrQG-ozw.jpeg" /><figcaption>Volunteer Robert kicking things off in Louisville</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*1UN2ybhqzBQAVr1r01aSow.jpeg" /><figcaption>The Louisville volunteers received a visit from the Mayor himself!</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/900/1*fneVI-wAo5u9iJntVG4TKQ.jpeg" /><figcaption>Volunteers in Austin, indeed making meaningful things</figcaption></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/900/1*p1v9GNi2oPSjnbrY9FMZjw.jpeg" /><figcaption>The littlest hacker in Austin, proving that D4D really is for everyone</figcaption></figure><h3>The Product</h3><p>In addition to the work that was done on existing D4D projects, some very exciting projects got underway this weekend. Here are just a few examples of what was done!</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*99KJJQT-lQacIFlvRTK2WA.png" /></figure><p>On Friday evening, longstanding D4D partner ProPublica (in conjunction with the New York Times and the Associated Press) posted financial disclosures for 180 White House staffers. 
D4D volunteers rose to the occasion immediately to convert these documents into structured and accessible data, available here on <a href="https://data.world/rflprr/d-4-d-hack-financial-disclosures">data.world</a>. They also created the above preliminary visualization to map the network of connections — people are rendered in red, and organizations are shown in green.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/700/1*uu1yxHeixGm31xa7bwMUSg.jpeg" /></figure><p>The Louisville team built an API of the 2017 Kentucky General Assembly, to make it easy for anyone in the state to look up the voting record of their legislators. The above is a first attempt at a visualization using this data — the Democrats who voted for GOP-sponsored bills.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ji1jAXRAkLQxi52p6u88vw.jpeg" /></figure><p>Volunteers in NYC analyzed child adoption rates and created a decision matrix to explore how adoption is affected by race.</p><p>And there’s much, much more that took place this weekend — check out our YouTube <a href="https://www.youtube.com/playlist?list=PL1oL3vZOS_SKps9eOtiNU-iHiMC51DK4u">playlist</a> for recordings of our community showcase, where volunteers presented highlights from their work! We’ll also have a “Spotlight Series” on this blog in the coming weeks, featuring a more detailed look at several hackathon projects.</p><h3>The Potential</h3><p>So, what comes next?</p><p>If you couldn’t participate in this weekend’s hackathon, no worries! A weekend, as action-packed as it might be, is a short timeframe. Lots of these great projects have only just gotten started, and will be continuing past this weekend. We encourage you to <a href="mailto:team@datafordemocracy.org">join our Slack team</a>, catch up on what’s happened so far, and then jump in to help. 
This was our first (and hopefully not last) global hackathon, and any future similar events will be announced through Slack.</p><p>We’d also like to emphasize that Data for Democracy started out as an online community, and remains Internet-based first and foremost. No matter where you are, you are still very much welcome and encouraged to be involved. The bulk of the activity takes place on Slack, so you needn’t worry about missing out!</p><p>We enjoyed the pizza, the stickers (thanks, data.world!), and the great coverage in <a href="http://www.geekwire.com/2017/1200-volunteers-slack-saving-democracy-one-data-set-time/">GeekWire</a> and <a href="http://insiderlouisville.com/metro/accountability/louisville-to-participate-in-national-data-for-democracy-hackathon/">local media</a>. But ultimately, this hackathon was meant to provide an opportunity for a community of civic-minded data enthusiasts, spanning all levels of skill and experience, to come together and figure out ways to contribute to the public good through the use of data. Thank you all for making this happen, and for reminding us of what is possible when we work together and put our minds to it.</p><p>Let’s keep going.</p><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=a634b018e19a" width="1" height="1" alt=""><hr><p><a href="https://medium.com/data-for-democracy/the-data-for-democracy-hackathon-is-over-a634b018e19a">The Data for Democracy Hackathon is Over!</a> was originally published in <a href="https://medium.com/data-for-democracy">Data for Democracy</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[Data for Democracy Hackathon: Happening This Weekend!]]></title>
            <link>https://medium.com/data-for-democracy/data-for-democracy-hackathon-happening-this-weekend-d3d694c1d966?source=rss-b6160498d99d------2</link>
            <guid isPermaLink="false">https://medium.com/p/d3d694c1d966</guid>
            <category><![CDATA[hackathons]]></category>
            <category><![CDATA[open-data]]></category>
            <category><![CDATA[data]]></category>
            <category><![CDATA[data-science]]></category>
            <category><![CDATA[civictech]]></category>
            <dc:creator><![CDATA[Lilian H]]></dc:creator>
            <pubDate>Sun, 26 Mar 2017 18:02:01 GMT</pubDate>
            <atom:updated>2017-03-26T18:02:01.358Z</atom:updated>
            <content:encoded><![CDATA[<h4>Or: Who, What, When, Where, Why, and How?</h4><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*ugzFgxdPtS4LSsKvibclUQ.png" /></figure><h3><strong>What:</strong></h3><p>The first <strong>Data for Democracy Global Hackathon.</strong><br>Data for Democracy is a community of people using data to drive better decisions and improve the world in which we live. Join us for our first ever hackathon, where groups and individuals across the world will be working together on civic-oriented data projects.</p><h3>Who:</h3><p>The Data for Democracy community, including <strong>you</strong>!</p><h3>Why:</h3><p>This is a golden opportunity to apply your data and technology skills and interests for social good. Whether you’re new to the D4D community or have been an active member for months, and whether you’d like to jump on board with an existing project or get a new idea off the ground, you’re welcome to be a part of the hackathon!</p><h3>When:</h3><p>The hackathon will be from <strong>March 31 (Friday), 6 pm EDT</strong>, to <strong>April 2 (Sunday), 2 pm EDT</strong>.</p><h3>Where:</h3><p>If you can’t make it to an in-person meetup, no worries — a lot of activity will be happening <strong>remotely</strong>! Email us at <a href="mailto:team@datafordemocracy.org">team@datafordemocracy.org</a> to get an invitation to our Slack. Check out our projects on <a href="https://github.com/Data4Democracy/read-this-first">GitHub</a>, and some of our datasets on <a href="https://data.world/data4democracy">data.world</a>.<br>The <strong>in-person meetups</strong> are being organized in some major cities. 
Find the one nearest you in the following list!</p><p><strong>Austin, Texas:</strong><br>Saturday, April 1, 10 am — 4 pm CDT<br><em>Dev Bootcamp<br>1705 Guadalupe Street<br>Austin, TX 78701</em><br><a href="https://www.eventbrite.com/e/data4democracy-hackathon-austin-tickets-32887683948">Register here</a></p><p><strong>Boston, Massachusetts:</strong><br>Saturday, April 1, 9.30 am — 3.30 pm EDT<br><em>55 Massachusetts Avenue<br>Building 5, room 5–134<br>Cambridge, MA 02142</em><br><a href="https://www.eventbrite.com/e/data-for-democracy-hackathon-boston-tickets-33088552752">Register here</a></p><p><strong>Chicago, Illinois:</strong><br>Friday, March 31, 5 pm — 10 pm CDT<br>Saturday, April 1, 10 am — 2 pm CDT<br><em>Metis Chicago<br>1033 West Van Buren St, 3rd Floor<br>Chicago, IL 60607</em><br><a href="https://www.meetup.com/Metis-Chicago-Data-Science/events/238326371/">Register here</a></p><p><strong>Louisville, Kentucky:</strong><br>Friday, March 31, 6 pm — Saturday, April 1, 6 pm EDT<br><em>LouieLab<br>745 West Main Street<br>Louisville, KY 40202</em><br><a href="https://www.eventbrite.com/e/data-for-democracy-louisville-hack-a-thon-tickets-33048900150">Register here</a></p><p><strong>New York City, New York:</strong><br>Saturday, April 1, 9 am — 9 pm EDT<br><em>NYU Center for Data Science<br>60 5th Ave<br>New York, NY 10011</em><br><a href="https://www.eventbrite.com/e/data-for-democracy-hackathon-nyc-tickets-33021075927">Register here</a></p><p><strong>Seattle, Washington:</strong><br>Friday, March 31, 6.30 pm — 10 pm PDT<br>Saturday, April 1, 8 am — 10 pm PDT<br>Sunday, April 2, 8 am — 5 pm PDT<br><em>Ada’s Technical Books<br>425 15th Avenue East<br>Seattle, WA 98112</em><br><a href="https://www.eventbrite.com/e/data-for-democracy-hackathon-seattle-tickets-32966017245">Register here</a></p><p><strong>Washington DC:</strong><br>Saturday, April 1, 10 am — 4 pm EDT<br><em>WeWork Crystal City<br>2221 South Clark Street<br>Arlington, VA 22202</em><br><a 
href="https://www.eventbrite.com/e/data4democracy-hackathon-dc-tickets-32883442261">Register here</a></p><h3>How:</h3><p>We’d like to thank our sponsors and supporters who are helping to make this possible.</p><ul><li><a href="http://www.seattletechnicalbooks.com/"><strong>Ada’s Technical Books</strong></a> is a Seattle bookstore that seeks to open the doors to technology and science as widely as possible, and provide a space for the tech community to gather.</li><li><a href="http://www.civicdataalliance.org/"><strong>The Civic Data Alliance</strong></a> is Louisville’s volunteer Code for America Brigade, and advocates for open data, hackathons, and civic engagement.</li><li><a href="https://data.world/"><strong>data.world</strong></a> is a social network for data people. It aims to provide a platform for data scientists to discover and share cool data, connect with others who share their interests, and work together to solve problems faster.</li><li><a href="https://devbootcamp.com/"><strong>Dev Bootcamp</strong></a> is an immersive coding bootcamp with locations in Austin, Chicago, New York, San Diego, San Francisco, and Seattle.</li><li><strong>LouieLab</strong> is a civic innovation hub and coworking space, designed to promote and build smart city projects through collaboration between private companies and the city of Louisville.</li><li><a href="https://www.thisismetis.com/"><strong>Metis</strong></a> provides full-time immersive bootcamps, evening part-time professional development courses, online resources, and corporate programs for data scientists in Chicago, New York, San Francisco, and Seattle.</li><li><a href="http://acses.mit.edu/"><strong>The MIT Association of Computational Science and Engineering Students</strong></a><strong> </strong>is a social organization for graduate students within the Center for Computational Engineering at MIT.</li><li><a href="http://cds.nyu.edu/"><strong>The NYU Center for Data Science</strong></a> aims to create the 
country’s leading data science training and research facilities, and equip researchers and professionals with tools to harness the power of big data.</li><li><a href="https://www.redoakstrategic.com/"><strong>Red Oak Strategic</strong></a> provides full-stack data science consulting services, including data engineering, predictive analytics, visualizations and data-driven strategies in the finance, corporate, political and other verticals.</li></ul><img src="https://medium.com/_/stat?event=post.clientViewed&referrerSource=full_rss&postId=d3d694c1d966" width="1" height="1" alt=""><hr><p><a href="https://medium.com/data-for-democracy/data-for-democracy-hackathon-happening-this-weekend-d3d694c1d966">Data for Democracy Hackathon: Happening This Weekend!</a> was originally published in <a href="https://medium.com/data-for-democracy">Data for Democracy</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[D4D at the Three-Month Mark]]></title>
            <link>https://medium.com/data-for-democracy/d4d-at-the-three-month-mark-4f2ac02c082f?source=rss-b6160498d99d------2</link>
            <guid isPermaLink="false">https://medium.com/p/4f2ac02c082f</guid>
            <category><![CDATA[hackathons]]></category>
            <category><![CDATA[data-science]]></category>
            <category><![CDATA[civictech]]></category>
            <dc:creator><![CDATA[Lilian H]]></dc:creator>
            <pubDate>Tue, 14 Mar 2017 17:02:01 GMT</pubDate>
            <atom:updated>2017-03-23T03:43:07.024Z</atom:updated>
            <content:encoded><![CDATA[<h4>Or: Hackathons, Competitions, and Press Coverage, Oh My</h4><p>So much has been going on since <a href="https://medium.com/data-for-democracy/the-first-two-months-of-d4d-da732998ec28#.4wpab4rkj">last month’s check-in</a> that it’s hard to know how to put it into words. So let’s start with some numbers!</p><p>Data for Democracy has reached:</p><blockquote>1038 members</blockquote><blockquote>More than 1430 GitHub commits, across 37 GitHub repos</blockquote><blockquote>Over 75000 Slack messages</blockquote><p>That’s right — we’ve passed the <a href="https://medium.com/data-for-democracy/our-first-1-000-volunteers-57c59af15547#.zdbqpoe0y">1000-volunteer mark</a> and are still going! You can be one of the next thousand, just by emailing <a href="mailto:team@datafordemocracy.org">team@datafordemocracy.org</a>. Read on for some exciting things and upcoming events you can be a part of, if you join the D4D community.</p><h3>Project Updates</h3><p>Here’s what some of our active projects have been up to in the past month, and where and how you can jump in.</p><h4>Assemble</h4><p>The <a href="https://github.com/Data4Democracy/assemble">Assemble</a> team, which is developing tools for researchers to use to study online communities, has been focusing on developing and deploying approaches to community detection. Within the next few weeks, they hope to put together a cohesive approach to community detection that can be easily picked up and used by researchers.</p><p>The team chose to turn their sights to this because they view community detection as a vital component of several larger buckets of intriguing research.</p><ul><li><strong>Information diffusion and social contagion:</strong> Namely, how information moves into and across a community. 
The past decade or so has seen a veritable explosion of research regarding how information flow happens in online social networks, and the team considers it important to understand how this happens in “far-right” and related communities.</li><li><strong>Bot detection:</strong> An important aspect of studying online social networks is the detection of digital personas. Why are they there? What are their characteristics? There are a great many avenues of research available for pursuit in this area.</li><li><strong>Language usage across communities:</strong> There are opportunities to build on recent research to predict changes in community composition based on their language use — for example, by examining the idioms, euphemisms, and memes that the community uses to communicate, and the graph properties of how this language is structured.</li></ul><p>A team member, Henri, has made major headway in developing and releasing the preliminary version of a community detection algorithm. His work has been constantly supported by his other teammates who are working on data collection and curation across dozens of online social network platforms. It’s still in its early stages, but initial findings are promising, and the team has high hopes that this will evolve into a one-stop solution for community detection and bot detection.</p><p><strong><em>How You Can Get Involved:</em></strong></p><p>There’s still a lot more work to be done with curating source datasets and developing algorithms, and the team would love more fresh perspectives.</p><p>If you’re interested in novel approaches to community detection in online social networks, consider jumping on board at the #assemble Slack channel! 
This is a golden opportunity to study some emerging communities and how social contagion works across the “far-right”.</p><h4>Drug Spending</h4><p>The <a href="https://github.com/Data4Democracy/drug-spending">Drug Spending</a> team is currently neck-deep in data acquisition, consolidation, and wrangling. Information on drugs and drug spending comes in a whole range of disparate datasets from multiple organizations, with little universal information to cross-match between them. In order to dive further into the available datasets, the team first needs to make sure they exist side-by-side in a cleaned and easily accessible form.</p><p>In line with this, the team has mainly been focused on acquiring and cleaning data sources. They are working hard at data consolidation — building indices and references between datasets, such that they can easily be associated with each other.</p><p>They have also ventured into making initial visualizations and exploratory analyses of these datasets. One volunteer, Stephanie, has been building an R Shiny <a href="https://skirmer.shinyapps.io/firstapp/">app</a> to visualize manufacturers and the drugs they produce, based on Medicaid and Medicare data. Another volunteer, David, is using a related dataset to <a href="https://github.com/Data4Democracy/drug-spending/blob/master/python/notebooks/drugs_w_lrg_yr-yr_increases/Year_Over_Year_Increases_Part_D.ipynb">visualize</a> year-over-year increases in drug spending.</p><p><strong><em>How You Can Get Involved:</em></strong></p><p>The team is currently grappling with drug classification, and trying to develop a framework to associate drug names with their general therapeutic uses at a large scale. The project could greatly benefit from the input of contributors who have domain knowledge and experience regarding pharmaceuticals and health policy. If you fit the bill, your help would be very welcome! 
Check out the <a href="https://data.world/data4democracy/drug-spending">datasets</a> and the #drug-spending Slack channel.</p><h4>Election Transparency</h4><p>The Election Transparency team has been busy over the last few weeks, working on collecting, normalizing, and standardizing historical data on county-level electoral outcomes. This effort continues to pose new challenges, as the format of the data varies widely — some states provide Excel spreadsheets, while others only make data available in PDFs. The team continues to work through these hurdles and find creative ways to build out their (<a href="http://data.world/data4democracy/election-transparency">publicly available!</a>) dataset.</p><p>They have also started to build models and create visualizations, leveraging their newly cleaned data to help explain the outcome of the 2016 U.S. presidential election. One contributor, Robert, drew on the dataset to build a Shiny <a href="https://rkahne.shinyapps.io/presidential_vistualization/">app</a> that displays county-level Presidential election results back to the year 2000.</p><p>Moving forward, the team is starting to collect data and shapefiles to explore the impact of redistricting and gerrymandering on electoral outcomes. Through partnering with the <a href="http://www.openelections.net/">OpenElections Project</a> to build precinct-level data for all statewide races, the team hopes to address questions including:</p><ul><li>What are appropriate measures of a fairly-drawn district?</li><li>How much do current district boundaries deviate from a fairly-drawn district?</li><li>How well do current districts represent the demographics of the country?</li></ul><p>Stay tuned for more detailed posts about the Election Transparency project in the coming weeks! 
These will delve into the nitty-gritty of the data collection process, and the key findings that have been made so far.</p><p><strong><em>How You Can Get Involved:</em></strong></p><p>The team is always interested in additional projects that will further their goal: to make the electoral system more transparent and easier to understand for everyone. If this is of interest to you, check out the project <a href="https://github.com/Data4Democracy/election-transparency">details</a> and join the #election-transparency Slack channel, and let them know if you have an idea!</p><h4>Internal Displacement</h4><p>Based on a challenge set by the Internal Displacement Monitoring Centre (IDMC), the D4D <a href="https://github.com/Data4Democracy/internal-displacement">Internal Displacement</a> team is building a tool to populate a database with information about displacement events, which can then be used by both machine and human analysts.</p><p>So far, most of the team’s efforts have been focused on building a Python back-end to scrape, classify, and extract information from articles in IDMC-provided datasets. Information retrieval is proving to be an interesting challenge, in part due to the complexities involved with natural language processing. The team has recently succeeded in finalizing the database schema for the project, and is now shifting gears to building the front-end app.</p><p><strong><em>How You Can Get Involved:</em></strong></p><p>The team is looking for volunteers who are interested in helping to build the front-end app to visualize and interact with the database.</p><p>In addition, the team will continue their work to further refine the back-end code, work on the tricky issue of natural language processing, and implement online machine learning for new documents. 
It’s a good time to join the #internal-displacement Slack channel if you’re interested in any of these areas!</p><h4>ProPublica</h4><p>So far, the collaboration between ProPublica and D4D has centered on two main threads: data relating to <a href="https://github.com/Data4Democracy/official-foreign-travel">official foreign travel</a> of elected representatives, and data on <a href="https://github.com/Data4Democracy/house_expenditures">House Expenditure reports</a>. The team’s work has largely focused on loading and cleaning these datasets to standardize them, remove duplicates, and wrangle the data into a convenient format for analysis.</p><p>The foreign travel dataset is based on the House Official Foreign Travel reports, published quarterly by the House Clerk. ProPublica hopes to eventually use this dataset to track how official foreign travel expenditure has changed over time, and in particular whether and how it is influenced by political and international events.</p><p>The House Expenditure project was one of several undertakings <a href="https://www.propublica.org/article/taking-cues-and-some-projects-from-sunlight-labs">adopted by ProPublica</a> in 2016, after the closure of Sunlight Labs — an open source community run by the Sunlight Foundation, which sought to use data to increase government transparency and accountability. Through working with D4D on <a href="https://projects.propublica.org/represent/expenditures">this dataset</a>, ProPublica aims to detect unusual variances in spending by lawmakers. ProPublica also hopes to eventually <a href="https://www.propublica.org/nerds/item/sunlight-labs-takeover-update">add a search interface</a> and make the data available for download.</p><p><strong><em>How You Can Get Involved:</em></strong></p><p>If you have any interest in investigating lawmaker activity and ensuring accountability, ProPublica is looking for more volunteers. 
Feel free to browse the <a href="https://data.world/data4democracy/propublica">data</a> and join the #propublica Slack channel if it piques your interest!</p><h4>USA Dashboard</h4><p>The <a href="https://github.com/Data4Democracy/usa-dashboard/issues">USA Dashboard</a> team, which is creating a dashboard that will display key metrics for various regions of the USA, has been moving data into PostgreSQL; the D4D community can now access the data through Mode, and run reports and preliminary analyses there. The team is currently working on defining data documentation and writing dictionaries, in order to facilitate more targeted analysis.</p><p><strong><em>How You Can Get Involved:</em></strong></p><p>The team is seeking domain experts who can provide input on how to count crime reports to make a fair comparison across cities.</p><p>As the project broadens its focus to explore new metrics, including economy, poverty, and healthcare, the team would also like to recruit domain experts in these areas, who can help to develop and frame research questions based on the available data.</p><p>If you’d like to be part of this effort, drop the team a line at the #usa-dashboard Slack channel.</p><h3>D4D in the News + KDNuggets competition</h3><p>Besides all the activity going on internally, D4D has also received some great press coverage elsewhere!</p><p>KDNuggets <a href="http://www.kdnuggets.com/2017/02/data-democracy-d4d.html">reposted</a> last month’s blog update, and we received a shout-out in a <a href="https://techcrunch.com/2017/02/21/data-world-raises-18-7-million-to-remedy-our-post-fact-society/">TechCrunch</a> feature about data.world, one of D4D’s most enthusiastic partners and supporters.</p><p>We’ve also sponsored a “Data Science vs Fake News” <a href="http://www.kdnuggets.com/2017/02/data-science-vs-fake-news-contest.html">contest</a>, in collaboration with KDNuggets and data.world. 
The deadline for submissions was on March 10, but if you missed it, no worries — there’s another upcoming opportunity to show off your data science chops!</p><h3>D4D Hackathon</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*vOXzFcjlNYoCbiBO." /><figcaption><em>Banner by volunteer Justin</em></figcaption></figure><p>We’re excited to announce the very first <strong>Data For Democracy Global Hackathon</strong>! More information about this event will be given in an upcoming post, but here are the key facts:</p><blockquote><strong>FROM:</strong> March 31st, 2017, 6 pm EDT<br><strong>TO:</strong> April 2nd, 2017, 2 pm EDT</blockquote><p><strong>Anyone can participate</strong>, simply by signing up as a D4D member (again, shoot an email to <a href="mailto:team@datafordemocracy.org">team@datafordemocracy.org</a> and we’ll get you sorted out)!</p><p>Since this is a global group, many people will be working remotely, using tools such as Google Hangouts, Slack, and GitHub. In-person meetups are also being organized in some major cities; again, more details will be coming nearer the date. Feel free to arrange a meetup of your own!</p><p>The Hackathon will conclude with a showcase on April 2nd, 2 pm — 3 pm EDT, where the D4D community will display and demonstrate what they built during the Hackathon, in a series of short presentations.</p><p>If you have a cool project idea you’d like to get off the ground, or want to get involved with D4D but don’t know where to start, this weekend will be an excellent opportunity to dive straight in. 
Don’t worry about not knowing anyone — by the end of the Hackathon, you will.</p><p>Keep an eye out for more news in the coming weeks!</p><hr><p><a href="https://medium.com/data-for-democracy/d4d-at-the-three-month-mark-4f2ac02c082f">D4D at the Three-Month Mark</a> was originally published in <a href="https://medium.com/data-for-democracy">Data for Democracy</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[The First Two Months of D4D]]></title>
            <link>https://medium.com/data-for-democracy/the-first-two-months-of-d4d-da732998ec28?source=rss-b6160498d99d------2</link>
            <guid isPermaLink="false">https://medium.com/p/da732998ec28</guid>
            <category><![CDATA[data]]></category>
            <category><![CDATA[github]]></category>
            <category><![CDATA[open-data]]></category>
            <category><![CDATA[government]]></category>
            <category><![CDATA[civictech]]></category>
            <dc:creator><![CDATA[Lilian H]]></dc:creator>
            <pubDate>Tue, 14 Feb 2017 18:02:01 GMT</pubDate>
            <atom:updated>2017-02-14T18:02:01.365Z</atom:updated>
            <content:encoded><![CDATA[<h4>Or: Where Do We Come From? What Are We? Where Are We Going?</h4><p>Since its inception in December 2016, the small community of Data For Democracy volunteers has grown into a network of over 700 people, spanning a range of locations, time zones, and backgrounds, as you can see on our <a href="http://datafordemocracy.org/about.html">brand new website</a>.</p><p>This group of passionate and civic-minded people is applying a diverse set of skills and knowledge to an equally varied selection of projects, and has made remarkable progress in the past two months! Here are some highlights of what we’ve been up to.</p><h3>Assemble</h3><p>The Assemble project is working to develop a toolkit and technological infrastructure that researchers can use to study online communities and their characteristics. With Ben and Nick at the helm, this scrappy team has rocketed past various milestones, including:</p><ul><li>Setting up a streaming data pipeline and database, with the generous assistance of our friends at <a href="http://eventador.io/">Eventador</a>. <a href="https://modeanalytics.com/">Mode Analytics</a> has also donated their platform for D4D use, meaning that Assemble’s social media data will soon be coming to you in Mode, via Eventador!</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*pepWYzQVQLchWNOX." /><figcaption><em>Architecture diagram by Assemble contributor Ahmad</em></figcaption></figure><ul><li>Holding a weekend hackathon to develop and refine Twitter data collection capabilities.</li><li>Starting work on a beginner-friendly scraping project, with the first milestone being to collect the 2017 congressional record.</li></ul><h3>Drug Spending</h3><p>This team, headed by the intrepid Matt and Jennifer, is focused on researching where and how Medicare tax dollars are being spent, and presenting these findings in clear and accessible ways. 
They’ve recently started constructing a Shiny dashboard that will make visualizing these findings much easier.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*q9FB92bMsshM3jD2." /><figcaption><em>Drug Spending dashboard for metformin</em></figcaption></figure><h3>USA Dashboard</h3><p>This project is working on creating a dashboard that will display key metrics for various regions of the USA. The team has gathered crime data for various cities including Chicago, Washington DC, New York City, and Philadelphia, and has developed a <a href="https://github.com/Data4Democracy/usa-dashboard">roadmap</a> for where the project is headed next, both literally and figuratively! You can also hear project lead Sean <a href="http://partiallyderivative.com/podcast/2017/02/07/frogger-will-steal-your-jobs">discuss</a> this work on the Partially Derivative podcast.</p><h3>Election Transparency</h3><p>Led by Scott, Chris, and Rachel, the Election Transparency project works on collecting and normalizing county-level election results, to be shared with the public. The team has put together extensive datasets of election results and population demographics, and these are all published and <a href="https://data.world/data4democracy/election-transparency">available for browsing</a> thanks to the support of our friends at <a href="https://data.world/">Data.World</a>. Moving forward, the team plans to create various models and visualizations that will help to explain election outcomes.</p><h3>ProPublica</h3><p>This team carries out data analysis to support the work done by the non-profit investigative journalism organization <a href="http://propublica.org">ProPublica</a>. Under the fearless leadership of Eric and Ryan, the team is investigating campaign expenditures, and has recently developed a nifty text-scraper to navigate some messy reports of government officials’ foreign travel.</p><h3>What’s Next?</h3><blockquote><em>“Is D4Ding a verb? 
Because that’s what I’m doing all weekend.” — Ben</em></blockquote><p>This is just a small sampling of the many projects Data For Democracy is currently tackling. New projects are always taking shape, whether initiated by our own community members or taken on through partnerships with other organizations. In particular, we have some exciting collaborations brewing with the Cities of San Diego, Los Angeles, and New York.</p><p>If one of our projects piques your interest, or you’d like to propose an idea of your own, come join us! There are all sorts of ways to get involved, no matter your level of commitment, skill, or experience.</p><p>If you’d like more detail on possible ways to jump in, and why you should be a part of this, stay tuned for upcoming posts!</p><hr><p><a href="https://medium.com/data-for-democracy/the-first-two-months-of-d4d-da732998ec28">The First Two Months of D4D</a> was originally published in <a href="https://medium.com/data-for-democracy">Data for Democracy</a> on Medium, where people are continuing the conversation by highlighting and responding to this story.</p>]]></content:encoded>
        </item>
    </channel>
</rss>