Browsed by
Tag: data-work

Load database from CSV with columns and tables read from file.

Load database from CSV with columns and tables read from file.

The issue at hand here is to load a database with CSV-files in a generic way without having to write separate script for each table. This short Groovy-script is working based on two conventions: The name of the CSV-file matches the name of the table in the database. The name of the CSV-header matches the name of the column in the table. The script requires the CSV-files to be read in the folder “to_digest”. The connection details is placed in…

Read More Read More

Work programmatically with Google Spreadsheets Part 2

Work programmatically with Google Spreadsheets Part 2

A while back I wrote a short post on how you can write and read to Google Spreadsheets programmatically using Python and the package ‘gspread’. Last time the reading was done by first creating arrays with the addresses to where the values could be found in the spreadsheet, and then run through all the values and replace the addresses with the values. It worked fine, but it’s not best practice or very efficient as it makes many single requests on…

Read More Read More

API-scrape images from Instagram

API-scrape images from Instagram

An image can say more than a thousand words, especially when you add a retro-filter and a score of hashtags to go with it. That is a basic explanation to the functionality of Instagram; the power app which revolutionised peoples creativity when it came to documenting dietary habits… and popularised images in social media. Instagram brought social into photography in a way other more desktop-oriented photo sharing applications like Picasa and Flick never managed. It is social and users can…

Read More Read More

Prepare Your Data For Analysis: Load Data With Spoon

Prepare Your Data For Analysis: Load Data With Spoon

Note: The instructions described here is learnt and utilized while working on the Clouds & Concerts project at University of Oslo in close collaboration with Xin Jian, Johannes Bjelland og Pål Roe Sundsøy under supervision by Arnt Maasø. The credit for methodology and SQL ninja tricks is therefore to be attributed to all members of the fantastic stream team. Inconveniences or mistakes are solely my own. Technical set up We chose to use Amazon as a solution for doing our…

Read More Read More

Useful Terminal Commands for work with data

Useful Terminal Commands for work with data

Please, see my terminal-tools tag for more articles on nifty terminal tools. Enclose values in quotes, and add comma Say you work with a large datafile where you get all values printed on consecutive lines, and you want to copy these lines into an array, list or other data structure where strings need to be quoted, and values needs to be separated by commas. Here is a sed script that may help you with that task. sed ‘s/\(.*\)/”\1″,/g’ all_uuid.csv >…

Read More Read More

Sanntid hjemme: Følg Ruter hjemmefra med Arduino og Python

Sanntid hjemme: Følg Ruter hjemmefra med Arduino og Python

Sanntidsskjermene som Ruter har satt opp ved flere t-bane, trikk og bussholdeplasser har blitt et kjærkomment bidrag til informasjonen du som reisende får om avganger. Ikke bare kan du ha stålkontroll over hvilke avganger som går, og i hvilken rekkefølge, du blir også oppdatert om estimert ankomsttid slik at du vet at du mista bussen du løp for å rekke eller om bussen har blitt forsinka i rush-trafikken. I tillegg til skjermene på holdeplassene som opplyser om sanntid, kan du få…

Read More Read More

Screenscrape av Øya-programmet

Screenscrape av Øya-programmet

Forskningsprosjektet Sky & Scene, hvor jeg jobber, ser blant mye annet nærmere på strømmetallene fra WiMP før, under og etter Øya-festivalen. For å gjøre dette trenger vi en liste over hvilke artister som spiller, hvilken dag de spiller og når på dagen de spiller. Før dataene kan analyseres må disse dataene være tilgjengelige i Excel-ark og i CSV-format og i databasen hvor strømmetallene finnes. Dataene må hentes og struktureres i et bestemt format. Et godt utgangspunkt er å samle dataene…

Read More Read More

Descriptive Statistical Methods in Code

Descriptive Statistical Methods in Code

Statistics can be mighty useful, and a little programming helps it getting even better. I often find that I through code can grasp the fundamental function behind how things work, and I have tried to apply this to statistics as well. In this case to descriptive statistics. Join me in this short attempt to let statistics unconceal itself. Getting ready: Extremistan and Mediocristan This little text is looking closer on some of the descriptive methods of statistics, and to do that…

Read More Read More

Work programmatically with Google Spreadsheets

Work programmatically with Google Spreadsheets

Some time back I authored a script which reads through a CSV formatted list, and based on the artists’ names tried to decide the nationality of the artists by querying the last.FM search engine and parsing the XML structured result. The script worked, and found and returned about 80% of the artists, and around 80% of these again had the conceptually similar artist. What if the alteration could be done in the document itself? For the CSV based script the…

Read More Read More

Data-wrangling: find country based on artist name

Data-wrangling: find country based on artist name

At the Clouds & Concerts project at the University of Oslo we are working with really interesting topics, based on interesting empirical data. Through our collaboration with the Norwegian streaming service provider WiMP we are together with Telenor and WiMP analysing a vast collection of data. More about the project’s data-part, also the ‘Clouds’ part of the project’s name can be found on the project’s web sites. Artist and Country One of the tasks at hand was to find out…

Read More Read More