Cybernetics observations of human/machine interactions during World War 2 computer-assisted anti-aircraft guns inspired by / enabled systems analysis from Greek kybernetike for "governance" feedback loops: outputs connected to inputs the foundational meetings were the Macy Conferences
10th Macy Conference group photo started in 1946, in New York City early Computer Scientists (there were about 3 digital computers in existence) John von Neumann, Norbert Wiener, Warren McCulloch, JCR Licklider, and many others psychologists from the Gestalt school Kurt Lewin, Wolfgang Köhler "the whole is different from the sum of its parts" attracted to systems/feedback loops
John von Neumann the original hacker? simulated nuclear explosions in WW2 wrote code for ENIAC, UNIVAC, whatever no programming language had been invented yet pioneer of big data von Neumann computer architecture very active in academic society at the right place at the right time repeatedly Influenced: Mandelbrot, Feynman, Wolfram
Kurt Lewin sometimes called the father of Social Psychology Field Theory: \(B = f(P, S)\) behaviour is a function of the person and their situation people respond differently to different situations our perceptions are part of the situation Lewin presented Field Theory at the 2nd Macy Conference. Influenced: Köhler, Bandura, Mischel, Shoda
Reaching hands Cybernetic movement rapidly collapsed Artificial Intelligence partly took up the cause Computer Science sort of dominated Psychology, despite studying mind and brain, remained at a distance Social Psychology and Personality Psychology split over Lewin's Theory possibly for lack of a computational, analytic framework? the students of von Neumann never connected with the students of Lewin
vCard PhD Candidate, Social Psychology Psychology Department Studying: Social Complexity and Collective Intelligence agent-based modelling experimental psychology social network analysis memes Academic background BS Cognitive Science, Carnegie Mellon University Research Analyst, Berkeley MA Psychology, University of Toronto http://imiller.utsc.utoronto.ca
Voûte de l'église Saint-Séverin à Paris buffet sample pack back-to-back "lightning talks" questions please ask for clarification please save theoretical questions until the end
Itinerary by Matthew Paris ca. 1250-1259 social psychology meets artificial intelligence pub2: self publishing memelab: meme sharing topoli: twitter political memes rofo: Rob Ford social network urban legends: memes and network topologies gh-impact: influence in open source software election-memes: 2016 US election memes pplapi: human simulacrum as a service
10th Macy Conference group photo Lewin and von Neumann meet at Macy 1 and 2: March 1946 October 1946 consider that moment in time as a starting point imagine spidering their social networks social network with one particular constraint: authors must be linked via co-authorship
some example BibTeX database of citations BibTeX is the interchange format read/write citations compatible with everything R Python LaTeX ugly, but good enough my Zotero library consists of \(n \approx\) 2500 citations autoexport plugin syncs a .bib file
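A minimal sketch of how co-authorship links could be extracted from a .bib file's author fields. It assumes the author strings have already been read out of the BibTeX entries and that names are separated by " and " (the BibTeX convention); the function name and the placeholder authors are hypothetical.

```python
from itertools import combinations


def coauthor_edges(author_fields):
    """Return the set of undirected co-authorship edges.

    Each element of author_fields is one BibTeX author string,
    e.g. "Alpha, A. and Beta, B.". Two authors are linked if they
    appear together on at least one entry.
    """
    edges = set()
    for field in author_fields:
        # Deduplicate and sort so each pair has one canonical order.
        authors = sorted({name.strip() for name in field.split(" and ") if name.strip()})
        edges.update(combinations(authors, 2))
    return edges


# Placeholder author fields (not taken from the actual library).
example_fields = [
    "Alpha, A. and Beta, B. and Gamma, C.",
    "Delta, D. and Epsilon, E.",
    "Zeta, Z.",  # single-author entries contribute no edges
]
edges = coauthor_edges(example_fields)
```

Single-author entries add no edges, which is one reason a curated library splits into several components.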
identify the largest component \(n_{authors} = 1574\) \(n_{edges} = 4905\) look at 2nd-largest component for authors who are missing a link library is curated, not randomly sampled calculate modularity (stochastic) to identify communities \(n_{communities} = 34\)
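The component step can be sketched with a plain breadth-first search. This is an illustration under assumptions, not the pipeline used for the talk: the function name is hypothetical, edges are assumed to be pairs of author names, and the modularity-based community detection is not shown.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Return connected components as sets of nodes, largest first."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search from an unvisited node.
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Toy edge list: one three-author component and one two-author component.
components = connected_components([("a", "b"), ("b", "c"), ("d", "e")])
largest, second_largest = components[0], components[1]
```

Inspecting `second_largest` corresponds to the step of looking for authors who are missing a link to the main component.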
The missing link? situated myself in the citation graph citation data "in the wild" are very poor quality the communities (detected by modularity) seem plausible some communities resisted collaboration Royal Society US and UK WWII computer scientists CS and Psychology sides of AI von Neumann is very distantly connected to Lewin's students Lewin's death may have dashed any chances of a collaboration
an article publication in 2016, that is where do you publish? where do you find articles? pub2: a system for self-publishing all you need is a PDF and a .bib file that links to it
Hoe's one cylinder printing press books conference proceedings presentations journals science-wide discipline-specific pre-publication: arXiv SSRN fooXiv self-publication??
Bibliothek St. Florian citeseer/citeulike/worldcat pubmed/government portal publisher search (ha!) open journals university or public library Google scholar warning: you will be rate-limited
Zines goal: get BibTeX indexed by Google Scholar digital object identifier? serve it yourself. Google don't care archival format HTML? PDF pub2 is a system for making this realistic
pub2 example source like Jekyll (static site generator) for PDFs generates PDF and .bib file that links to PDF simple LaTeX isn't so different from Markdown YAML preamble/templating like Jekyll actually sits on top of an existing Jekyll install more info pip install pub2 http://pub2.readthedocs.io
The meme that started it all this is where my research started in 2011 Dawkins (1976): cultural replicator viral: in epidemic modelling, when there is faster infection than recovery image macro: the style of meme consisting of a background picture and inset captions memes have been used in psychological research for decades “a method in which greeting cards are used to examine how parents communicate with their children” (Cacioppo & Andersen, 1981)
Meme choice screen on meme8.com online behavioural laboratory software on a web server clone of other meme creator websites (ecology) formerly, was only at http://meme8.com
Sharing screen on meme8.com participants use memelab to create sharable memes in the lab (after survey) memelab hosts the images with a unique URL any time an image is viewed by an online user, this is logged as a “hit.”
Microscope participants UTSC undergrads n=118 participants each participant created 2 memes (total = 236) 50% had created memes before study Demo pick a background picture put words at the top and bottom of the picture use sharing interface so friends can see meme
This is a linear model displayed graphically model incorporates features of the content boolean values subject: the meme is academic or not (in this study: animal) language: the meme caption includes self-reference words or not model also incorporates ratings meme is funny personally meaningful
Regression function SIR in Agent-Based Modelling Susceptible Infected Recovered use slopes from linear model. use gaussian noise to create a distribution of random memes. threshold (0.5) makes the function into a "trigger" that can lead to infection.
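A minimal sketch of the trigger idea, under stated assumptions: the coefficients, intercept, and noise parameters below are hypothetical placeholders (the fitted slopes are not given on the slide), and only the 0.5 threshold comes from the talk.

```python
import random

# Hypothetical slopes and intercept; the real values would come from the fitted linear model.
SLOPES = {"academic": -0.1, "self_reference": 0.1, "funny": 0.3, "meaningful": 0.2}
INTERCEPT = 0.2
THRESHOLD = 0.5  # from the slide: turns the regression output into a trigger


def random_meme(rng):
    """Draw a random meme: boolean content features plus Gaussian-noise ratings."""
    return {
        "academic": rng.random() < 0.5,
        "self_reference": rng.random() < 0.5,
        "funny": rng.gauss(0.5, 0.2),
        "meaningful": rng.gauss(0.5, 0.2),
    }


def share_score(meme):
    """Linear model output: intercept plus slope times feature, summed over features."""
    return INTERCEPT + sum(SLOPES[name] * float(meme[name]) for name in SLOPES)


def triggers_sharing(meme):
    """True when the score crosses the threshold, which can lead to infection."""
    return share_score(meme) > THRESHOLD


rng = random.Random(0)
memes = [random_meme(rng) for _ in range(1000)]
shared_fraction = sum(triggers_sharing(m) for m in memes) / len(memes)
```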
Simulation zoomed detail At each time “tick” the experimenter wanders the space and recruits any agents who happen to be in the vicinity. Recruited agents create a meme (based on gaussian noise) and those agents share with a probability determined by the regression function. Infections signified by agent remarking, “Seen it.” An edge arrow tracks infection.
Geometric roof behavioural data from participants model of intention to share "ported" sharing model to an agent-based model run simulation and obtained similar results
Toronto City Hall a hash tag is used to channel tweets towards interested audiences topoli = toronto politics #topoli used to discuss municipal politics related to Toronto. analogous hashtags #onpoli and #capoli
The bird observed 1,276,077 tweets between Nov 2013 - January 2015 832,889 imported so far 474,831 original (non-retweet) tweets 82,874 agents have been observed tweeting
topoli participants are in Toronto of the 82,874 participating, 52,517 (63%) reported some location information of the 52,517 who reported something: 15,752 (30%) included “toronto” in location 8,420 (16%) included “canada” in location but the GPS speaks for itself
panning for gold frequency minus “stopwords” looking forward to mayor of toronto toronto mayoral race city of toronto this is the pointwise mutual information (PMI) the gravy train the island airport little red apples a speedy recovery the shirtless jogger
panning for gold only original tweets remove retweets (which inflate n-gram frequency) PMI is like chi-square how often did we observe the n-gram versus expected number of observations results: the n-grams that are co-original (not duplicates)
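The observed-versus-expected comparison can be sketched for two-word phrases as follows. This is a simplified illustration: it scores bigrams only (the slide also shows trigrams), uses whitespace tokenization, and the function name and toy tweets are assumptions.

```python
import math
from collections import Counter


def bigram_pmi(tweets):
    """Score each bigram by pointwise mutual information.

    PMI = log2( p(x, y) / (p(x) * p(y)) ): how much more often the pair
    is observed than expected if the two words occurred independently.
    """
    unigrams = Counter()
    bigrams = Counter()
    for tweet in tweets:
        tokens = tweet.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    n_tokens = sum(unigrams.values())
    n_bigrams = sum(bigrams.values())
    scores = {}
    for (first, second), count in bigrams.items():
        p_pair = count / n_bigrams
        p_first = unigrams[first] / n_tokens
        p_second = unigrams[second] / n_tokens
        scores[(first, second)] = math.log2(p_pair / (p_first * p_second))
    return scores


# Toy original (non-retweet) tweets, invented for illustration.
toy_tweets = ["gravy train", "gravy train", "the city", "the mayor", "the budget"]
scores = bigram_pmi(toy_tweets)
```

In the toy data "the" appears with many different partners, so its pairs score lower than "gravy train", whose words almost always occur together.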
Well-connected Accounts I chose a random walk-based approach pick a random person from the network randomly pick someone they follow; repeat if people are connected within n steps, they are in a community run this thousands of times
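A rough sketch of the random-walk procedure, assuming the follow graph is a dictionary mapping each account to the list of accounts it follows. Counting how often two accounts appear on the same walk is one possible way to operationalize "connected within n steps"; the function names and this counting rule are assumptions, not the method as implemented for the talk.

```python
import random
from collections import Counter


def random_walk(follows, start, n_steps, rng):
    """Follow up to n_steps randomly chosen 'follows' links from start."""
    path = [start]
    current = start
    for _ in range(n_steps):
        followed = follows.get(current)
        if not followed:
            break  # dead end: this account follows nobody
        current = rng.choice(followed)
        path.append(current)
    return path


def co_visit_counts(follows, n_walks, n_steps, seed=0):
    """Count how often each pair of accounts lands on the same walk."""
    rng = random.Random(seed)
    people = sorted(follows)
    counts = Counter()
    for _ in range(n_walks):
        path = random_walk(follows, rng.choice(people), n_steps, rng)
        visited = sorted(set(path))
        for i, a in enumerate(visited):
            for b in visited[i + 1:]:
                counts[(a, b)] += 1
    return counts


# Toy follow graph with two disconnected pairs of accounts.
toy_follows = {"a": ["b"], "b": ["a"], "c": ["d"], "d": ["c"]}
counts = co_visit_counts(toy_follows, n_walks=100, n_steps=3)
```

Pairs with high counts are candidates for the same community; pairs that never co-occur, such as "a" and "c" here, are not.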
Calculator meme content is predictive of retweets knowing an individual’s group provides a better prediction of meme propagation than meme content alone better communities may yield better model fit evidence that individuals differ in their unique abilities to harness memes towards retweets
Geometric roof social network analysis serves to organize a huge amount of social data potential memes can be identified by extracting n-grams from original tweets, and it explains some variance in retweets “cliques” can be identified according to who is more closely connected with whom each meme is unique, and individuals differ in their ability to successfully tweet a meme.
Ford at Windsor Rd this picture surfaced in the wake of the Rob Ford crack video “who are these guys?” every kid in the picture has been the victim of a murder attempt (one successful) just the start of Rob Ford’s shady connections
Lisi ITO the police happened to be collecting data for us (names, places, times, etc) a golden opportunity to explore social network analysis dossier: people locations associations
Network detail custom online viewer unpacks the social network from the coded data d3.js Python Flask-Diamond backend Fruchterman-Reingold layout places connected people closer; repels distant people locations attract connected people as well
Geometric roof the original police reports are not as expressive as the network browser The rofo project permits the visual exploration of narratives correct social network layout algorithm facilitates the narrative we can’t conclude anything about the people involved …but sometimes there aren't too many hops!
Flammarion urban legends are subject to emotional selection (Heath et al, 2001) disgust, a high arousal negative emotion, was shown to predict sharing likelihood (Eriksson and Coultas, 2014) replicate Eriksson and Coultas, 2014 with a simulation explore scenarios beyond what was tested in the lab
Replicate lab topology serial transmission paradigm participants are linked into chains topology 160 agents 40 transmission chains 4 agents per chain 3 time steps disgusting legends are shared so infrequently that in practice, the 4th participant rarely receives anything. follow-up study: does it help to have 100x more agents? no.
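The chain topology can be sketched as below. The share probability is a hypothetical parameter (the fitted value is not on the slide), and the function name is an assumption; only the 40 chains of 4 agents come from the talk.

```python
import random


def simulate_chains(n_chains, chain_length, share_probability, seed=0):
    """Count, per chain position, how many chains delivered the legend there.

    Position 0 always holds the legend. Each agent passes it to the next
    agent with probability share_probability; otherwise the chain stops.
    """
    rng = random.Random(seed)
    reached = [0] * chain_length
    for _ in range(n_chains):
        for position in range(chain_length):
            reached[position] += 1
            if rng.random() >= share_probability:
                break  # this agent did not share, so the chain ends here
    return reached


# 40 chains of 4 agents, with a hypothetical low sharing probability.
reached = simulate_chains(n_chains=40, chain_length=4, share_probability=0.2)
```

Under this independence assumption the chance of reaching the 4th agent is the share probability cubed, so adding more chains raises the counts but not the proportion.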
Replicate lab topology connected agents via preferential attachment after 3 time steps, expected "viral" outcomes were obtained agents in the 3rd step were finally receiving a good portion of memes explored other network sizes \(n=800\) and \(n=16000\) with more agents and more possible connections, even disgusting legends stay alive
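A minimal sketch of growing a network by preferential attachment, assuming each new agent links to a fixed number of existing agents chosen with probability proportional to their current degree. The function name, the `links_per_agent` parameter, and the fully connected starting core are assumptions for illustration.

```python
import random


def preferential_attachment(n_agents, links_per_agent, seed=0):
    """Return an edge list for a network grown by preferential attachment."""
    rng = random.Random(seed)
    edges = []
    # Each agent appears in this list once per edge endpoint, so sampling
    # uniformly from it picks agents in proportion to their degree.
    endpoints = []

    # Start from a small fully connected core.
    core = links_per_agent + 1
    for a in range(core):
        for b in range(a + 1, core):
            edges.append((a, b))
            endpoints.extend([a, b])

    for new_agent in range(core, n_agents):
        chosen = set()
        while len(chosen) < links_per_agent:
            chosen.add(rng.choice(endpoints))
        for old_agent in chosen:
            edges.append((new_agent, old_agent))
            endpoints.extend([new_agent, old_agent])
    return edges


# Same number of agents as the lab replication, different topology.
edges = preferential_attachment(n_agents=160, links_per_agent=2)
```

Unlike a chain, a well-connected agent here has many incoming routes, so a legend that one neighbour declines to share can still arrive from another.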
Geometric roof lab studies can provide valuable models of behaviour even if the lab study is implausible even if it does not perform "correctly" out of context models can be transplanted from the literature into agent-based simulations a model can be "ported" from an unfavourable context to a more plausible environment
Percent of accounts belonging to Organizations What is gh-impact? Where does gh-impact come from? Who does gh-impact apply to? Why does gh-impact matter? How can I use gh-impact?
Top-10 gh-impact is a measure of influence on GitHub. Accounts that publish lots of popular projects will have higher gh-impact scores. captures breadth and depth of project use on GitHub gh-impact score is n if there are n projects with n stars ex: 1, 1, 1 = 1 ex: 1, 2, 3 = 2 "gh" stands for "good & hot"
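The scoring rule can be sketched as an h-index over star counts. This assumes "n projects with n stars" means at least n stars each, which matches both slide examples; the function name is hypothetical and this is not the production pipeline.

```python
def gh_impact(stars):
    """Largest n such that at least n projects have at least n stars each."""
    ranked = sorted(stars, reverse=True)
    score = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            score = rank
        else:
            break
    return score


# The two examples from the slide.
example_a = gh_impact([1, 1, 1])
example_b = gh_impact([1, 2, 3])
```

One very popular project cannot raise the score alone, and neither can many unstarred projects, which is how the measure captures both breadth and depth.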
99th Percentile Organizations related to academic citation analysis ("bibliometrics") GitHub public data API GHTorrent data dumps own custom statistics pipeline
Individuals vs Organizations all accounts on GitHub individuals and organizations must have at least one star to have a gh-impact score not just software: websites, curation, science …
Organizations can scale up assess productive output of accounts get credit for open source work (e.g. academics) gh-impact is a key metric for developing other analyses intelligence about open source software
in Collective Intelligence proceedings search: gh-impact scores for over 1.1 million accounts analysis: results from ongoing statistical analysis are regularly posted open data: download raw JSON from GitHub http://www.gh-impact.com
The future, figuratively gh-impact is a measure of influence on GitHub similar to how influence is measured in academia describes both individual and organization GitHub accounts gain intelligence about open source software search, analysis, and data on http://www.gh-impact.com
Vote we are approaching US election day 4 platforms: twitter instagram imgur facebook about 38,000 memes collected by a meme aggregator called Sizzle interesting data: total number of likes
Earth seen from Apollo 17 this can only go so far eventually, you hit 7.2 billion for Humans, there is no larger sample for Social Psychologists, there is but one population \(n \approx 7.2B\)
Selecting one agent at random are we living inside a simulation? Moravec, Bostrom, and others raise the possibility I have set out to demonstrate that a simulacrum may be created …so long as we can tolerate some error the result is pplapi.com web service \(n \approx 7.2B\)
Results are available as JSON enables agent based modelling in varying social and geopolitical contexts Supports: netlogo MASON R LISP (so: ACT-R, others) an analogy for netlogo users Agent : AgentSpace :: Behavior : BehaviorSpace
Age in US: Census and Simulation clearly not fully accurate in the ballpark partly constrained by data implications for identity, privacy data are simulated, not real however, imagine they were real …and isn't that likely, given the rate of information disclosures?
Geometric roof Simulacrum as a Service certain caveats apply plugs in with many existing modelling frameworks examples in several languages: netlogo, mason, R you can actually experiment with pplapi.com rate limited, but otherwise open actively using pplapi on several social projects http://pplapi.com
Serpiente alquimica / Ouroboros What if Lewin's ideas had found a home among AI researchers? Recall Field Theory for one person: \(B = f(P, S)\) behaviour is a function of the person and the situation I can derive model functions from empirical data or from the literature I have the set of all people with pplapi I can calculate behaviour as a property of my simulation situation changes in response to behaviour, so this is also a property of the simulation altogether, this is a new justification for computational social psychology
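One way to write the feedback loop down, offered only as a sketch: the time index \(t\) and the situation-update function \(g\) are assumptions introduced here for illustration and are not part of Lewin's notation.

\[ B_t = f(P, S_t), \qquad S_{t+1} = g(S_t, B_t) \]

Each simulation tick computes behaviour from the person and the current situation, then updates the situation from that behaviour.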
Itinerary by Matthew Paris ca. 1250-1259 social psychology meets artificial intelligence pub2: self publishing memelab: meme sharing topoli: twitter political memes rofo: Rob Ford social network urban legends: memes and network topologies gh-impact: influence in open source software election-memes: 2016 US election memes pplapi: human simulacrum as a service
the end a wide range of social phenomena can be explored with agent-based modelling networks are a powerful tool for organizing and utilizing relationships memes are an excellent unit of analysis. simultaneously: a coherent concept that is meaningful to humans a chunk of data that can be quantified and tracked Field Theory was the conceptual bridge between these disciplines My current work is called Iterated Field Theory
vCard thank you for the opportunity to present this presentation is online http://iandennismiller.github.io/net-complex-intel contact information http://imiller.utsc.utoronto.ca