Below I provide a list of references that have inspired the Graph Workflow model and are referenced throughout www.graphworkflow.com.
Journal articles
- Anscombe F. J. (1973). “Graphs in Statistical Analysis”. American Statistician. 27 (1): 17–21.
- Box G. and D. Cox (1964), “An analysis of transformations”. Journal of the Royal Statistical Society B, 26: 211-252.
- Chapman L. (1967). “Illusory correlation in observational report”. Journal of Verbal Learning and Verbal Behavior. 6 (1): 151–155.
- Chatterjee, Sangit; Firat, Aykut (2007). “Generating Data with Identical Statistics but Dissimilar Graphics: A Follow up to the Anscombe Dataset”. American Statistician. 61 (3): 248–254.
- Chernoff H. and Lehmann E. (1954), “The Use of Maximum Likelihood Estimates in χ² Tests for Goodness of Fit”, Annals of Mathematical Statistics, 25(3):579-586.
- Christodoulou D. (2017), “Heuristic criteria for selecting an optimal aspect ratio in a two-variable line plot”, The Stata Journal, 17(2):279-313.
- Cleveland W. (1979), Robust locally weighted regression and smoothing scatterplots, Journal of the American Statistical Association 74: 829–836.
- Cleveland W. and Devlin S. (1988), “Locally Weighted Regression: An Approach to Regression Analysis by Local Fitting”, Journal of the American Statistical Association, 83(403):596-610.
- Cleveland W., Devlin S. and Grosse E. (1988), “Regression by Local Fitting: Methods, Properties, and Computational Algorithms”, Journal of Econometrics, 37:87-114.
- Cleveland W. and McGill R. (1984), “Graphical perception: Theory, experimentation, and application to the development of graphical methods”, Journal of the American Statistical Association, 79(387):531-554.
- Cleveland, W., M. McGill, and R. McGill (1988), “The shape parameter of a two-variable graph”, Journal of the American Statistical Association, 83(402): 289–300.
- Cleveland W., E. Grosse and W. Shyu, (1992) “Local regression models.” Statistical models in S: 309-376.
- Cleveland W. (1993), “A Model for Studying Display Methods of Statistical Graphics”, Journal of Computational and Graphical Statistics 2(4): 323–343.
- Fan, J. (1992) “Design-adaptive nonparametric regression”. Journal of the American Statistical Association 87: 998–1004.
- Friendly M. (2002), “Visions and re-visions of Charles Joseph Minard”, Journal of Educational and Behavioral Statistics. 27 (1), 31–52.
- Guha, S. and W. Cleveland (2011), “Perceptual, mathematical, and statistical properties of judging functional dependence on visual displays”, Technical report, Purdue University Department of Statistics.
- John J. and N. Draper (1980) “An alternative family of transformations”, Applied Statistics. 29: 190-197.
- Johnson N. (1949) “Systems of frequency curves generated by methods of translation”, Biometrika, 36: 149-176.
- Healey C. and J. Enns (2012), “Attention and visual memory in visualization and computer graphics”, IEEE Transactions on Visualization and Computer Graphics, 18(7):1170-1188.
- Lord, F. (1967) “A paradox in the interpretation of group comparisons”, Psychological Bulletin, 68(5):304-305.
- Makles A. (2012), “Stata tip 110: How to get the optimal k-means cluster solution”, The Stata Journal, 12(2):347-351.
- Matejka J. and Fitzmaurice G. (2017). “Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing”. Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems: 1290–1294.
- Meijering E. (2002), A chronology of interpolation: From ancient astronomy to modern signal and image processing. Proceedings of the IEEE 90: 319–342.
- Talbot J., Gerth J. and P. Hanrahan (2012), “An empirical model of slope ratio comparisons”, InfoVis Conference Proceedings.
- Tufte E. (1990), “Data-Ink Maximization and Graphical Design”, Oikos, 58(2):130–144.
- Tukey, J. (1957) “On the comparative anatomy of transformations”, Annals of Mathematical Statistics, 28: 602-632.
- Velleman P. (1980) “Definition and comparison of robust nonlinear data smoothing algorithms”, Journal of the American Statistical Association 75: 609–615.
- Wainer, H. (2003), “Visual Revelations: A Graphical Legacy of Charles Joseph Minard: Two Jewels from the Past”, Chance, 16(1):58–62.
- Wainer H., Njue C. and Palmer S. (2000), Assessing Time Trends in Sex Differences in Swimming & Running, Chance, 13(1):10–15.
- Yeo, I. and R. Johnson (2000) “A new family of power transformations to improve normality or symmetry”, Biometrika, 87: 954-959.
Books
- Albers, Josef (1963), Interaction of Color, Yale University Press.
- Bertin, Jacques (1983). Semiology of Graphics. Madison, Wisconsin: University of Wisconsin Press. (original published in 1967)
- Bertin, Jacques (1981). Graphics and Graphic Information Processing. Elmsford, N. Y.: Walter de Gruyter.
- Cairo A. (2016), The Truthful Art: Data, Charts, and Maps for Communication, New Riders.
- Cairo A. (2012), The Functional Art: An Introduction to Information Graphics and Visualisation, New Riders.
- Card S., Mackinlay J. and B. Shneiderman (1999), Readings in Information Visualization: Using Vision to Think, Morgan Kaufmann Publishers.
- Chambers J., Cleveland W., Kleiner B., and Tukey P. (1983), Graphical Methods for Data Analysis. Belmont, CA: Wadsworth.
- Cleveland W. (1993), Visualizing Data, Summit, NJ: Hobart.
- Cleveland W. (1994), The Elements of Graphing Data, Summit, NJ: Hobart.
- De Boor, C. (2001). A Practical Guide to Splines. Springer.
- Everitt, B. S. (1993), Cluster Analysis, 3rd ed. London: Arnold.
- Everitt, B. S., S. Landau, M. Leese, and D. Stahl. (2011). Cluster Analysis, 5th ed. Chichester, UK: Wiley.
- Eubank, R. (1999), Nonparametric Regression and Spline Smoothing, 2nd ed. New York: Dekker.
- Fan J., and I. Gijbels (1996), Local Polynomial Modelling and Its Applications, London: Chapman & Hall.
- Goodall, C. (1990) “A survey of smoothing techniques”, in Modern Methods of Data Analysis, ed. J. Fox and J. S. Long, 126–176. Newbury Park, CA: Sage.
- Hastie T., Tibshirani R. and J. Friedman (2009), The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. New York: Springer.
- Heinrich Julian (2013), Visualization Techniques for Parallel Plots, OPUS, University of Stuttgart
- Hoaglin D., Mosteller F. and J. Tukey (1983), Understanding Robust and Exploratory Data Analysis, New York: Wiley.
- Hoaglin D., Mosteller F. and J. Tukey (1985), Exploring Data Tables, Trends, and Shapes, New York: Wiley.
- Hoaglin D., Mosteller F. and J. Tukey (1991), Fundamentals of Exploratory Analysis of Variance, New York: Wiley.
- Huff, Darrell (1954), How To Lie With Statistics, New York: W. W. Norton & Company.
- Inselberg Alfred (2009) Parallel Coordinates: Visual Multidimensional Geometry and Its Applications, Springer.
- Katz, Joel (2012) Designing Information: Human factors and common sense in designing information, Wiley.
- Kaufman L. and P. Rousseeuw (1990) Finding Groups in Data: An Introduction to Cluster Analysis, New York: Wiley.
- Nightingale, Florence (1858), Mortality of the British army : at home and abroad, and during the Russian war, as compared with the mortality of the civil population in England, https://archive.org/details/mortalityofbriti00lond
- Klee, Paul (1961) The Thinking Eye (The Notebooks of Paul Klee), G. Wittenborn, NY/Lund Humphries, London.
- MacEachren A. and Taylor D. (1994), Visualization in Modern Cartography, Elsevier Science, Tarrytown, NY.
- Munzner, Tamara (2014), Visualization Analysis and Design, AK Peters Visualization Series, A K Peters/CRC Press
- Playfair, William (1786), The Commercial and Political Atlas: Representing, by Means of Stained Copper-Plate Charts, the Progress of the Commerce, Revenues, Expenditure and Debts of England during the Whole of the Eighteenth Century.
- Rendgen, Sandra (2018): The Minard System. The Complete Statistical Graphics of Charles-Joseph Minard. New York, Princeton Architectural Press.
- Scott, D. (2015), Multivariate Density Estimation: Theory, Practice, and Visualization, Hoboken, NJ: Wiley.
- Snow, John (1855), On the Mode of Communication of Cholera, 2nd Ed, John Churchill, New Burlington Street, London, England.
- Silverman, B. (1986), Density Estimation for Statistics and Data Analysis. London: Chapman & Hall.
- Simonoff, J. (1996). Smoothing Methods in Statistics. New York: Springer.
- Slocum T., McMaster R., Kessler F. and H. Howard (2008) Thematic Cartography and Geovisualization, 3rd edition, Prentice Hall.
- Tufte, Edward (1990), Envisioning Information, Cheshire, CT: Graphics Press.
- Tufte, Edward (1997), Visual Explanations: Images and Quantities, Evidence and Narrative, Cheshire, CT: Graphics Press.
- Tufte, Edward (2001), The Visual Display of Quantitative Information, 2nd edition, Graphics Press.
- Tukey, John (1977), Exploratory Data Analysis, Reading, MA: Addison-Wesley.
- Velleman P., and D. Hoaglin. (1981), Applications, Basics, and Computing of Exploratory Data Analysis, Boston: Duxbury.
- Wainer, Howard (1997), Visual Revelations: Graphical Tales of Fate and Deception from Napoleon Bonaparte to Ross Perot (2nd edition, Hillsdale, N. J.: Lawrence Erlbaum Associates, 2000 ed.), New York: Copernicus Books.
- Wainer, Howard (2005), Graphic Discovery: A Trout in the Milk and Other Visual Adventures. Princeton, N.J.: Princeton University Press.
- Wainer, Howard (2009). Picturing the Uncertain World: How to Understand, Communicate and Control Uncertainty through Graphical Display. Princeton, NJ: Princeton University Press.
- Wainer, Howard (2016). Truth or Truthiness: Distinguishing Fact from Fiction by Learning to Think like a Data Scientist. New York: Cambridge University Press.
- Ware, C. (2004), Information Visualization: Perception for Design, 2nd edition, Morgan Kaufmann, San Francisco.
- Ware, C. (2008), Visual Thinking for Design, Morgan Kaufmann, Burlington, MA.
- Wilkinson, Leland (1999), The Grammar of Graphics, Springer.
Websites
- Beyond Words Studio https://beyondwordsstudio.com
- Alberto Cairo’s http://www.thefunctionalart.com
- Axis Maps https://www.axismaps.com/team
- Nathan Yau’s www.flowingdata.com
- Robert Kosara’s www.eagereyes.org
- Michael Friendly’s www.datavis.ca
- Michael Bostock’s https://bost.ocks.org/mike
- Cynthia Brewer’s Color Brewer 2.0 http://colorbrewer2.org
- Martin Krzywinski’s http://mkweb.bcgsc.ca
Online tools
- Google charts: https://developers.google.com/chart
- Visualisation cheat sheets using Stata: https://geocenter.github.io/StataTraining
- Plot.ly is a browser-based tool using Python, Javascript or R: https://plot.ly
- D3.js is a JavaScript library for data visualisation: https://d3js.org
- Raw graphs for hierarchical structures: http://rawgraphs.io
- Carto for creating dynamic maps: https://carto.com
- Datawrapper for creating interactive embeddable charts and maps: https://www.datawrapper.de
- Timeline JS for making embeddable timelines using Google spreadsheets: https://timeline.knightlab.com
Data repositories
- 538 website
- AirBnB
- Amazon public datasets
- Australian Bureau of Statistics (ABS)
- Australia Government open data
- Gephi datasets
- Github Awesome Public Datasets
- Kaggle
- London open data
- Network repository
- New York City open data
- OECD data bank
- Programmable web APIs
- Tableau public datasets
- UK Government open data
- United Nations data bank
- University of California Irvine Machine Learning datasets
- US Center for Disease Control and Prevention
- US Government open data
- World Bank