As part of the CoLiTec initiatives, Maja Miličević (University of Belgrade) and Adriano Ferraresi (University of Bologna) held a one-day workshop at the Department of Interpreting and Translation (DIT) offering an introduction to statistical concepts and methods in (corpus) linguistics.
The workshop was an excellent opportunity for anyone wishing to include quantitative analyses in their research. As Maja and Adriano repeatedly warned, statistics can be a bit “tricky”, and one needs a firm grasp of quantitative methods to describe data and generalise from them properly!
After learning how to formulate a hypothesis and organise data accordingly, participants were introduced to the R software and its command-line environment. The morning session covered basic descriptive statistics (frequency distributions, measures of central tendency and measures of dispersion), which were then applied to a dataset. The afternoon session focused on generating graphs (scatter plots, bar charts, box plots, mosaic plots) and provided a theoretical and practical introduction to inferential statistics (the chi-square test and correlation).
Although R seemed a bit hard to learn at first, participants quickly familiarised themselves with it and were soon impressed by its potential.