Categories
data-work

Load database from CSV with columns and tables read from file.

The issue at hand here is to load a database with CSV-files in a generic way without having to write separate script for each table. This short Groovy-script is working based on two conventions: The name of the CSV-file matches the name of the table in the database. The name of the CSV-header matches the […]

Categories
data data-work

SQL: Setup a Learning Workbench with SQLite and SQuirreL

Install SQLite To download SQLite you only need to download a file from the project homepage of SQLite. Select download, and precompiled download for your prefered operating system. Unzip the folder with the binary, and place it in a folder that is accessible from your command line interface (The terminal on Mac/*nix systems and Powershell […]

Categories
data-work

Work programmatically with Google Spreadsheets Part 2

A while back I wrote a short post on how you can write and read to Google Spreadsheets programmatically using Python and the package ‘gspread’. Last time the reading was done by first creating arrays with the addresses to where the values could be found in the spreadsheet, and then run through all the values […]

Categories
data-work

API-scrape images from Instagram

An image can say more than a thousand words, especially when you add a retro-filter and a score of hashtags to go with it. That is a basic explanation to the functionality of Instagram; the power app which revolutionised peoples creativity when it came to documenting dietary habits… and popularised images in social media. Instagram […]

Categories
data-work

Prepare Your Data For Analysis: Load Data With Spoon

Note: The instructions described here is learnt and utilized while working on the Clouds & Concerts project at University of Oslo in close collaboration with Xin Jian, Johannes Bjelland og Pål Roe Sundsøy under supervision by Arnt Maasø. The credit for methodology and SQL ninja tricks is therefore to be attributed to all members of […]

Categories
data-work

Useful Terminal Commands for work with data

Please, see my terminal-tools tag for more articles on nifty terminal tools. Enclose values in quotes, and add comma Say you work with a large datafile where you get all values printed on consecutive lines, and you want to copy these lines into an array, list or other data structure where strings need to be […]

Categories
data-work

Safe functions for casting numbers and timestamp in Oracle SQL

In the process of importing data which can be irregular from a flat file it is convenient to import all values as character strings, in this case Oracle’s varchar2, and then cast them into the proper format using the built-in datatype converters as e.g. to_number. The problem occurs when some value which is not convertible […]

Categories
data-work

Screenscrape av Øya-programmet

Forskningsprosjektet Sky & Scene, hvor jeg jobber, ser blant mye annet nærmere på strømmetallene fra WiMP før, under og etter Øya-festivalen. For å gjøre dette trenger vi en liste over hvilke artister som spiller, hvilken dag de spiller og når på dagen de spiller. Før dataene kan analyseres må disse dataene være tilgjengelige i Excel-ark […]

Categories
data-work

Norske illustrasjonskart på fylkes- og kommunenivå med SVG

Vektor- og rastergrafikk SVG – Scalable Vector Graphics – er, som gitt av navnet, et skalerbart vektorbasert grafikk-markup. Vektorgrafikk er bedre egnet til skalering enn raster, da bildet ikke består av små bildeelementer (Pixel – kort for nettopp picture element), men formler for opptegning av grafikken. Nå har det seg riktignok slik at ikke alt […]

Categories
data-work

Work programmatically with Google Spreadsheets

Some time back I authored a script which reads through a CSV formatted list, and based on the artists’ names tried to decide the nationality of the artists by querying the last.FM search engine and parsing the XML structured result. The script worked, and found and returned about 80% of the artists, and around 80% […]