Working with Data

Karolinska Institutet University Library Flemingsberg

April 1-2, 2020

9:00 am - 4:30 pm

Instructors: Joakim Philipson, Rosa Lönneborg, Thomas Lind, Lina Andrén

Helpers: Glenn Haya

General Information

Where: Alfred Nobels allé 8. The university library, room Hypofysen 1. Get directions with OpenStreetMap or Google Maps.

When: April 1-2, 2020. Add to your Google Calendar.

Requirements: Participants must bring a laptop running macOS, Linux, or Windows (not a tablet, Chromebook, etc.) on which they have administrative privileges. They should have a few specific software packages installed, as detailed in the Setup section below.

Accessibility: We are committed to making this workshop accessible to everybody. The workshop organizers have checked that:

Materials will be provided in advance of the workshop, and large-print handouts are available on request if you notify the organizers in advance. If we can help make learning easier for you (e.g. sign-language interpreters, lactation facilities), please get in touch (using the contact details below) and we will attempt to provide them.

Contact: Please email glenn.haya@ki.se for more information.


Code of Conduct

Everyone who participates in Carpentries activities is required to conform to the Code of Conduct. This document also outlines how to report an incident if needed.


Registration

Registration is open to people affiliated with KI, KTH or SU. It is not active yet; we will add a link here when it becomes available.


Collaborative Notes

We will use this collaborative document for chatting, taking notes, and sharing URLs and bits of code.



Schedule

Day 1

Before Starting Pre-workshop survey
09:00 Databases and SQL
10:30 Morning break
12:00 Lunch break
13:00 OpenRefine for Social Science
14:30 Afternoon break
16:00 Wrap-up
16:30 END

Day 2

09:00 Python
10:30 Morning break
12:00 Lunch break
13:00 Python continued
14:30 Afternoon break
16:00 Wrap-up
16:30 Post-workshop survey

Setup

SQLite

SQL is a specialized programming language used with databases. We use a simple database manager called SQLite in our lessons.

Windows

  • Run Git Bash from the Start menu.
  • Copy the following command: curl https://kth-biblioteket.github.io/2019-11-13-carpentry/getsql.sh | bash
  • Paste it into the window that Git Bash opened. If you're unsure, ask an instructor for help.
  • You should see something like 3.27.2 2019-02-25 16:06:06 ...
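To confirm the result independently of the installer script, a minimal check along these lines can be run in the same shell. It assumes only that the program is named sqlite3 and is on your PATH; the variable name sqlite_status is purely illustrative.

```shell
# Report whether sqlite3 is reachable from this shell.
# sqlite_status is an illustrative variable name, not part of the lesson.
if command -v sqlite3 >/dev/null 2>&1; then
    sqlite_status="found: $(sqlite3 --version)"
else
    sqlite_status="not found: sqlite3 is not on the PATH"
fi
echo "$sqlite_status"
```

If the message starts with "found:", the rest of the line should resemble the version string shown in the last step.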

If you want to do this manually, download sqlite3, make a bin directory in your home directory, unzip sqlite3, move it into the bin directory, and then add the bin directory to your PATH.
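The manual steps could look roughly like the sketch below. It assumes the archive has already been downloaded and saved as sqlite-tools.zip in the current directory (a hypothetical file name; use whatever your download is actually called) and that unzip is available in your shell. BIN_DIR and ZIP_FILE are illustrative variables.

```shell
# Sketch of the manual install. ZIP_FILE is a hypothetical name for the
# archive you downloaded; BIN_DIR defaults to a bin directory in your home.
BIN_DIR="${BIN_DIR:-$HOME/bin}"
ZIP_FILE="${ZIP_FILE:-sqlite-tools.zip}"

# Make the bin directory (no error if it already exists).
mkdir -p "$BIN_DIR"

# Unpack only if the archive is present. -j drops the folder structure
# inside the zip so the program lands directly in BIN_DIR; -o overwrites.
if [ -f "$ZIP_FILE" ]; then
    unzip -j -o "$ZIP_FILE" -d "$BIN_DIR"
else
    echo "Archive $ZIP_FILE not found; download it first."
fi

# Add BIN_DIR to the PATH for this session unless it is already there.
case ":$PATH:" in
    *":$BIN_DIR:"*) ;;
    *) PATH="$BIN_DIR:$PATH"; export PATH ;;
esac
```

The PATH change above lasts only for the current session; adding the same line to your ~/.bashrc is one common way to keep it.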

SQLite comes pre-installed on macOS and Linux.

  • In case of problems, register for an account at Python Anywhere.
  • Download survey.db.
  • Click on Files and upload survey.db.
  • Click on Dashboard and choose the new console "$ bash".
If you installed Anaconda, it also includes a copy of SQLite, but without readline support. Instructors will provide a workaround if needed.

OpenRefine

For this lesson you will need OpenRefine and a web browser. Note: OpenRefine is a Java program and requires a Java Runtime Environment on your machine (if you do not have one, it can be downloaded from here: JDK). It runs inside a web browser (Chrome is recommended; Firefox also works), but no internet connection is needed.
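If you are not sure whether a Java runtime is already present, a check along these lines can be run in a terminal (Git Bash on Windows). It is a minimal sketch that assumes the runtime, if installed, is reachable as java on your PATH; java_status is an illustrative variable name.

```shell
# Report whether a working Java runtime is reachable from this shell.
# java_status is an illustrative variable name.
# The exit status of "java -version" is used so that a missing command
# and a non-working placeholder command are both reported as "not found".
if java -version >/dev/null 2>&1; then
    # java -version writes its report to stderr, so redirect it to stdout.
    java_status="found: $(java -version 2>&1 | head -n 1)"
else
    java_status="not found: install a Java Runtime Environment first"
fi
echo "$java_status"
```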

Windows

Check that you have either the Firefox or the Chrome browser installed and set as your default browser. OpenRefine runs in your default browser. It will not run correctly in Internet Explorer.

Download software from http://openrefine.org/

Create a new directory called OpenRefine.

Unzip the downloaded file into the OpenRefine directory by right-clicking and selecting "Extract ...".

Go to your newly created OpenRefine directory.

Launch OpenRefine by clicking openrefine.exe (this will launch a command prompt window, but you can ignore that - just wait for OpenRefine to open in the browser).

If you are using a different browser, or if OpenRefine does not automatically open for you, point your browser at http://127.0.0.1:3333/ or http://localhost:3333 to use the program.

macOS

Check that you have either the Firefox or the Chrome browser installed and set as your default browser. OpenRefine runs in your default browser. It may not run correctly in Safari.

Download software from http://openrefine.org/.

Create a new directory called OpenRefine.

Unzip the downloaded file into the OpenRefine directory by double-clicking it.

Go to your newly created OpenRefine directory.

Install OpenRefine by dragging the icon into the Applications folder.

Use Ctrl-click/Open ... to launch it.

If you are using a different browser, or if OpenRefine does not automatically open for you, point your browser at http://127.0.0.1:3333/ or http://localhost:3333 to use the program.

Linux

Check that you have either the Firefox or the Chrome browser installed and set as your default browser. OpenRefine runs in your default browser.

Download software from http://openrefine.org/.

Make a directory called OpenRefine.

Unzip the downloaded file into the OpenRefine directory.

Go to your newly created OpenRefine directory.

Launch OpenRefine by entering ./refine into the terminal within the OpenRefine directory.

If you are using a different browser, or if OpenRefine does not automatically open for you, point your browser at http://127.0.0.1:3333/ or http://localhost:3333 to use the program.
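On any of the platforms above, if the browser page stays blank, it can help to check from a terminal whether anything is answering at the local address yet. The sketch below assumes curl is installed; check_openrefine is a hypothetical helper, not something shipped with OpenRefine.

```shell
# check_openrefine is an illustrative helper, not part of OpenRefine.
# It returns 0 if a server answers at the given URL, non-zero otherwise.
check_openrefine() {
    url="${1:-http://127.0.0.1:3333/}"
    if ! command -v curl >/dev/null 2>&1; then
        echo "curl is not available; open $url in your browser instead." >&2
        return 2
    fi
    # -s: quiet, -f: treat HTTP errors as failure,
    # -o /dev/null: discard the page, --max-time 5: give up after 5 seconds.
    if curl -s -f -o /dev/null --max-time 5 "$url"; then
        echo "A server is answering at $url"
    else
        echo "Nothing is answering at $url; is OpenRefine still starting?" >&2
        return 1
    fi
}

check_openrefine || true
```

A successful answer only shows that some server is listening on that port; opening the address in the browser remains the real test.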

Python

More information coming soon.