Course: UCACCMET2J
Data Analysis for Liberal Arts and Sciences
Course information
Course code: UCACCMET2J
Credits (EC): 5
Course goals
After completing this course, students are able to:
- describe to a non-expert what a computer is, what the different operating systems are, broadly how a program or application works, and what a script does
- in an automatic fashion, manipulate large quantities of digital information by being able to search, sort, move, copy and concatenate files
- employ algorithmic approaches to problem solving
- write simple scripts, using simple statements, loops, iteration and regular expressions
- work at a very basic level with languages including Python, R and LaTeX
- work at a more-than-basic level with one (or more) of these languages
- understand concepts of unstructured data and structured data, and work with the commonly used CSV and JSON data-interchange formats
- work with common tools to acquire publicly available data and share their own data in a structured way
- perform simple (pre)processing and analysis steps on datasets
- pose a data-oriented research question and work in a group to solve it and present the results in a clear and elegant fashion
- most importantly, approach a large dataset with an understanding of the expertise and techniques needed to extract the information their research requires
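
As a rough illustration of the kind of task these goals describe, the short Python sketch below reads CSV data, filters rows with a regular expression, and writes the result as JSON. It is not taken from the course materials: the sample data and the helper names `csv_to_records` and `filter_by_pattern` are hypothetical and chosen only for this example.

```python
import csv
import io
import json
import re

# Hypothetical sample data; not taken from the course materials.
RAW_CSV = """city,year,visitors
Utrecht,2015,1200
Amsterdam,2016,3400
Utrecht,2016,1500
"""


def csv_to_records(text):
    """Parse CSV text into a list of dictionaries, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_by_pattern(records, field, pattern):
    """Keep the records whose `field` value matches the regular expression."""
    regex = re.compile(pattern)
    return [row for row in records if regex.search(row[field])]


records = csv_to_records(RAW_CSV)

# Select the rows whose city name starts with "Utr".
utrecht = filter_by_pattern(records, "city", r"^Utr")

# A very simple analysis step: sum one numeric column.
total = sum(int(row["visitors"]) for row in utrecht)

# Convert the filtered rows to JSON, a common data-interchange format.
as_json = json.dumps(utrecht, indent=2)

print(as_json)
print("Total visitors:", total)
```

The same pattern (read, filter, summarise, export) should carry over to larger files read from disk, although real datasets usually need extra cleaning steps first.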
Course content
This course may replace one lab course for Science majors. It is not recommended for students majoring in Mathematics, Physics, or Chemistry.

Content
Increasingly, research in all disciplines, from the natural sciences to social sciences and humanities, involves big data. The availability of vast amounts of textual, audio-visual and structured data from digital sources is revolutionising research in the humanities and social sciences. The most advanced scholarship in these areas, currently and in the foreseeable future, relies on the use of sophisticated tools for accessing, processing, analysing and presenting this data.

In this three-week module, we use real-world data to allow students to gain familiarity and experience with some common approaches to handling large datasets. They learn algorithmic thinking and the general concepts required for data analysis with computers. They engage with a number of very common programming languages and tools, culminating in a group project based on a real-world dataset, where they extract relevant information from it in an automated fashion, perform some simple analysis, and display the results visually.

As part of a liberal arts curriculum, this module stimulates the kind of thinking that our college hopes to engender: the use of multiple paradigms to solve problems, drawing on reasoning, logic, analysis, hypothesis-testing, and formal problem-solving methods.

Format
The first two weeks take place on campus at University College. Morning sessions involve an introduction to a topic and supervised hands-on computer exercises in the classroom. During the afternoons the students work on hand-in assignments, which are returned to them with feedback by the end of the day. At the end of the second week, the students do a graded individual assignment.
If enough students register, the third week is residential and takes place at a location close to Utrecht where all participants, including teachers, are expected to stay four nights, Monday to Friday. There is a focus on team-forming and project work. This week will include work sessions, presentations, and evening programs related to the theme of the module. At this location, the class, divided into groups, will work on separate projects which, at the end of the week, will be brought together in a symposium where students will present their findings.

Schedule
The course runs for three weeks, from 22 May to 9 June 2017.

Online materials
The course will involve some short MOOCs (short, interactive online courses), in particular from Codecademy and Coursera. Some examples:
- Codecademy, command line: www.codecademy.com/learn/learn-the-command-line
- Codecademy, Python: www.codecademy.com/learn/python
- Code School, Try R: tryr.codeschool.com
- Codecademy, Git: www.codecademy.com/learn/learn-git

Tutorials and Reference Works
Students will be pointed to a choice of tutorials, including the official tutorials given by makers of particular software, as well as simple user-friendly guides. Standard reference texts will be available during the module for students to consult for assignments. Whenever useful, lecture notes and programming cheat sheets will be provided.

Software
We will only be using software that is freely available. Detailed installation instructions will be provided at the start of the course.

Assessment
Assignments from the first two weeks are assessed on a pass/fail basis, where successfully completing an assignment guarantees a pass. If all assignments are passed, students can progress to the third week group project. Students who fail to complete the assignments cannot complete the module. At the end of the second week, there will be a graded individual assignment aimed at connecting all the individual components together and testing the skills that are required during the third-week project. At the end of the third week, groups will present their findings and hand in a research report on the process and obtained results. The report will be assessed based on the research approach and process, and the quality of the final product. The presentation will be assessed based on the clarity and quality of the presentation of the results. Additionally, all students will have to hand in a short personal report about the group project, including reflection on the progress made throughout the course.

Please check out the webpage for more information: bit.ly/ucudatascience

Teachers
Coordinators: Teun van Gils and Lucie Kattenbroek
Teachers/facilitators: Teun van Gils, Edwin van der Helm, Joska de Langen, Lucie Kattenbroek, David van Leeuwen (unconfirmed), Joris Vincent