Into the exclusion zone

I was born in 1986, and one of the names forever associated with that year is Chernobyl. Yesterday I had the ambivalent “pleasure” of visiting parts of the exclusion zone: the buffer zone, the Pripyat zone and the 10 km zone in which the reactor is located. Towards the end of 2017 a new shelter, or sarcophagus, was put in place over the reactor. This, together with the old shelter and the work of 500,000 liquidators after the catastrophe, makes the zone a safe place to visit for a short period of time, and today an industry has grown up around taking people into the zone and showing them around.

Following a test run on 26 April 1986, reactor 4 at the Chernobyl power plant exploded, sending a huge amount of radioactive material into the sky. The cloud created by the accident first blew north and then westward, raising radioactivity levels throughout the Soviet Union and the rest of Europe. The accident became known in the West as radioactivity levels were picked up outside the Soviet Union, but within the Soviet Union it was kept secret and the consequences were played down. In fact, on 1 May, only a few days later, the May Day celebration in Kiev, a few hours to the south by car, went ahead uninterrupted.

At the same time, a huge operation was initiated to get control over the fire and the radiation constantly being spread by the accident. At first, firefighters tried to put out the fire without any protective gear; at this point any direct exposure was lethal. Later, helicopters were used to drop sand, clay and eventually lead onto the fire, and this work too was lethal. Much was done: water was drained from under the reactor to avoid a fatal reaction that would have left large parts of Europe uninhabitable; villages in the area around the reactor were emptied and buildings demolished; Pripyat, a city of around 50,000 people, was evacuated, and its inhabitants were never to move back. Above the reactor a sarcophagus was erected, first by robots, as the radiation was lethal from as little as 20 seconds of exposure, but as some of the robots malfunctioned, humans were used as well: “bio-robots” covered in makeshift protective gear. The human tragedies, still continuing to this day, were and are immense. About 500,000 people were involved in the clean-up, many of them exposed to large amounts of radiation.

The first sarcophagus was built to last for around 30 years; the new one is designed to last for another 100 years, but the area will stay uninhabitable for thousands of years.

Polar adventure

From a long weekend trip to Svalbard in June 2017. The pictures are from Longyearbyen and from trips to the Russian mining town Pyramiden and across the Adventsfjord.

Five trips near Oslo

Kongens utsikt

A short drive from Oslo, along Tyrifjorden, you will find Kongens utsikt (the King's View). It is an easy, well-prepared walk of about 1.5 km from the car park at Krokkleiva to the viewing platform, which gives a fine view over Tyrifjorden.

Tyrifjorden seen from Kongens utsikt

How to get there?

Take the E18 out of Oslo towards Kristiansand. At Sandvika, turn off towards Hønefoss. Take the exit at Sundvollen and drive up to Kleivstua. As of July 2017 the toll is 30 kroner. Park at Kleivstua and follow the marked trail.

Jeløya

A lovely island just outside Moss, with several attractions and farm shops. There is art and good cinnamon buns at Galleri F15, and you can walk down to the water. Jeløya is also home to Refsnes Gods, which offers good food and beautiful surroundings. The 250-year-old manor, now a hotel, is a member of De Historiske (the historic hotels) and houses several Munch paintings as well as works by Jakob Weidemann and Andy Warhol. The hotel has also been awarded the Olavsrosa. A trip out to Jeløy Radio is also recommended, both for the view and to see the radio installation.

Galleri F15
The entrance doors of the main building at Jeløy Radio
Radio masts at Jeløy Radio

 

How to get there?

Take the E6 towards Göteborg, exit towards Moss, and follow the signs.

Spiralen in Drammen

While hunting for nice places to visit near Oslo, I stumbled upon Spiralen in Drammen by chance. Spiralen is a tunnel shaped as a spiral climbing up through the mountain. Originally made to extract rock, Spiralen was opened to traffic in 1961. After driving through Spiralen you come straight out into Drammensmarka, and there are several nice walks you can start from the top. There is also a café at the top.

Drammen seen from the top of Spiralen

How to get there?

Take the E18 towards Kristiansand. When approaching Drammen, take exit 24, the Brakerøya junction, towards Drammen N/Rv283. At the hospital, find Eivind Olsens vei and follow it to the top. There is a small toll for driving Spiralen.

Nordmarka along Gjøvikbanen: Barlidsåsen from Slippen

The trips so far have been easiest to do by car, but the next two trips are in Oslomarka, where the train is by far the best way to get there. The first, Barlidsåsen from Slippen, is an easy trip best done on foot. The walk over the ridge follows an unmarked but clear path of about 2 km and gives good views on both sides of the ridge. Towards the end of the ridge you reach a blue-marked trail. Here I usually turn left down towards Styggedalen. From there you can continue towards Solemskogen, Linderudkollen, Kjelsås or wherever you like in Lillomarka. To get a trip of about 10 km I have followed the blue trail on towards Lilloseter and then down to Grorud or Romsås.

How to get there?

From Oslo, take Gjøvikbanen (stations in the central parts of the city: Oslo S, Tøyen, Grefsen, Nydalen or Kjelsås) to Slippen. Slippen is the first stop after Kjelsås. Note that not all trains on Gjøvikbanen stop there.

Nordmarka along Gjøvikbanen: cycling or skiing from Stryken

A classic day trip in Marka, just as nice by bike in the summer as on skis in the winter. The trip starts with some uphill, but after a short stretch it is mostly downhill. Roughly halfway there is the option of stopping at Kikut, where there is a café. The route is well signposted the whole way, with gravel roads for cycling and groomed tracks in the winter. Since the ski tracks cross the lakes, the route varies a little between winter and summer. There are many nice places to stop along the way, and you can choose where you want to come out of the forest. A good route option is Stryken, Kikut, Ullevålseter, Sognsvann.

On the way you pass Hakkloa, where you can, for example, walk out to the dam at the southern end of the lake.

 

How to get there?

From Oslo, take Gjøvikbanen (stations in the central parts of the city: Oslo S, Tøyen, Grefsen, Nydalen or Kjelsås) to Stryken. If the train does not stop at Stryken, you can easily cycle to Stryken from Harestua.

 

Load database from CSV with columns and tables read from file.

The issue at hand is to load a database from CSV files in a generic way, without having to write a separate script for each table.

This short Groovy script works based on two conventions:

  • The name of the CSV-file matches the name of the table in the database.
  • The name of the CSV-header matches the name of the column in the table.

The script expects the CSV files it should read to be placed in the folder “to_digest”. For example, a file to_digest/PERSONS.csv whose header line matches the columns of the table PERSONS will be loaded into that table.

The connection details are placed in a properties file, which can be added to the .gitignore file if the script is kept under Git version control. In this example the configuration variables are stored in the file ‘configuration.properties’, which needs to be in the folder the script is executed from.

The properties file contains the following lines:

db_url=jdbc:oracle:thin:@yourdbserver:1521/schema
db_user=yourusername
db_password=yourpassword

Please also note that this script is written for Oracle, and that the Oracle drivers are fetched from a local Nexus repository, since they are not available on the open Internet. The script can easily be changed to another database vendor. If you choose an open-source database such as MySQL or PostgreSQL, the drivers are available in public Maven repositories and you will not need the reference to a local repository through the @GrabResolver annotation.

 

@GrabResolver(name='nexus', root='http://yourorganizationsnexusinstance.you', m2Compatible = 'true')
@GrabConfig(systemClassLoader=true)
@Grapes([
        @Grab(group='com.oracle', module='ojdbc8', version='12.2.0.1'),
        @Grab(group='com.opencsv', module='opencsv', version='3.9')
])


import groovy.sql.*
import com.opencsv.*

import static groovy.io.FileType.FILES


// Walk the to_digest folder and load every CSV file into the table with the same name
def path = new File("to_digest")
path.traverse(type : FILES, nameFilter: ~/\w*\.csv/) { it ->
    List entries = digestFile(it)
    // The first row is the header (the column names), the remaining rows are data;
    // the table name is the file name without the .csv extension
    insertFile(entries.subList(0, 1)[0],
               entries.subList(1, entries.size()),
               it.name.take(it.name.lastIndexOf('.')))
}


// Read a semicolon-separated CSV file into a list of rows
private List digestFile(def path) {
    CSVReader reader = new CSVReader(new FileReader(path), (char) ';')
    List myEntries = reader.readAll()
    return myEntries
}


// Insert the data rows into the table, using the header row as the column list
private void insertFile(def header, def data, def name) {
    Properties properties = new Properties()
    File propertiesFile = new File('configuration.properties')
    propertiesFile.withInputStream {
        properties.load(it)
    }
    println name
    Sql conn = Sql.newInstance(properties.db_url, properties.db_user, properties.db_password)
    data.each { rows ->
        String columns = String.join(",", header)
        String values = rows.collect { "'$it'" }.join(",")
        String query = "INSERT INTO ${name} (${columns}) VALUES (${values})"
        conn.execute(query)
    }
    conn.close()
}
Lovholm.net back with new layout

In recent years the activity on lovholm.net has, unfortunately, decreased. Hopefully I will soon be able to kick-start some new articles, as I have picked up a few new things over the last few years that I want to share with the world.

As you may have noticed if you have visited this page before, it does not look quite like it did a few months, or five years, ago. The last larger update of the page (WordPress engine updates excluded) was back in 2011 (six years ago already; time goes fast) while I was doing my studies in Edinburgh. As the world moves on, I saw it fit to change the visual template, update a few plugins, and do a clean install of WordPress.

That being said: Welcome to the new page!

There may be some glitches, there may be some source code that is not correctly formatted, and pictures may be missing. Please don’t hesitate to contact me if content you want to read does not render properly or pictures are missing. Although I will try to go through the most vital parts of the page, updates and checks are usually done sporadically.

SQL: Setup a Learning Workbench with SQLite and SQuirreL

Install SQLite

To download SQLite you only need to download a file from the SQLite project homepage. Select Download, and pick the precompiled binaries for your preferred operating system.

Unzip the folder with the binary and place it in a folder that is accessible from your command-line interface (the terminal on Mac/*nix systems, and PowerShell or cmd on Windows), that is, a folder listed in the PATH variable. Here is how you can add a folder to the PATH on several operating systems.

Open the terminal and type sqlite3 (sqlite3.exe on Windows).

Once you have installed SQLite in a folder accessible from the path, you should get this prompt when starting the program.

 

Download a sample dataset

The next step is to download the data we are going to use. Luckily for us, there are many sample databases available online. One of these is the Chinook database. Download the dataset from here. Unzip the archive and find the ‘Chinook_Sqlite.sqlite’ file. Move this to a folder and start sqlite3 with the path to the database as an argument.

Validate the downloaded database by displaying the tables it contains (and then, if you want, run a query to see some content – just for fun).

Start SQLite with the database, display the tables in the database and run a query (text marked in yellow is input).
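
If you would rather do this check from a small script than at the interactive prompt, a minimal Python sketch (using the standard library sqlite3 module, and assuming the file Chinook_Sqlite.sqlite sits in the working directory) could look like this:

import sqlite3

# Open the downloaded sample database (assumed to be in the current folder)
conn = sqlite3.connect("Chinook_Sqlite.sqlite")

# List the tables, mirroring the check done at the sqlite3 prompt
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
).fetchall()
print([t[0] for t in tables])

# Run a small query just to see some content
for row in conn.execute("SELECT Name FROM Artist LIMIT 5"):
    print(row[0])

conn.close()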

Download SQuirreL SQL, a SQL query tool

You have done great! We now have a functional database with both SQLite and some data we can use. We could stop now and do what we want with SQLite, but to work more comfortably it would be nice to have a tool besides the CLI to interact with the database.

SQuirreL is just such a tool: an open-source SQL workbench with support for several database systems, including MySQL, Oracle, DB2 and SQLite. In this GUI tool you can browse the database, run queries, display graphs of how tables are related, and more.

Find the installation files here. Select the build appropriate for your operating system, run the downloaded file and step through the installation.

Connect SQuirreL with your database

Eu Wern Te has written a short guide on how to install a JDBC driver for SQLite and connect SQuirreL to the SQLite database. To summarize: download the JDBC driver and add it to SQuirreL’s lib folder. When this is done, open SQuirreL, add the driver, and then create an alias where you insert the path to the SQLite database in the URL field. You will need to fill out several fields, so check out the guide.

Once you have connected SQuirreL to the database, you can browse and query it through this tool.

 

SQuirreL connected to the SQLite database.

Booking a train journey in India

Booking a train journey in India can be a little tricky, but it is absolutely worth doing. There is an enormous rail network between the different cities and regions, and a huge range in how much space and service you get and how long the trains take. Not least, it is a wonderful way to see the country between the cities.

During our stay in India we took the Rajdhani (meaning “capital”) between Mumbai and New Delhi and, in first class, got our own compartment, tea, coffee and a three-course dinner. Between New Delhi and Agra we travelled second class and had no private compartment, but we got our own berths and could screen ourselves off with a curtain. When deciding on a departure and a class, there are two things worth thinking about:

  1. Priority in the rail network. Some departures get higher priority than others and therefore travel quickly and smoothly through the world’s most complex network of routes and stations. The Rajdhani departures, for example, generally run on time, with few delays and problems.
  2. Comfort level. There are many compartment and ticket classes on Indian trains. If you are only travelling for a couple of hours it can be fun and cheap to go in a simpler ticket category, but if you are travelling overnight or over a long distance it is nice to have more space in an air-conditioned carriage.

Booking procedure

When we booked train journeys in India we did it through Cleartrip. There can be a lot of demand for certain departures, so it is wise to book tickets well in advance, but with a search engine such as Cleartrip you can find alternative routes if the main connection is sold out. It can take a couple of days to book your first journey through Cleartrip, since you need to verify your e-mail address and phone number to complete a purchase through IRCTC, the travel agency of the Indian railway company IR. To complete the purchase of our tickets I followed this good guide at Indiamike. The nice thing about this approach and Cleartrip is that once your Cleartrip account is linked to IRCTC, you can book new train tickets without much trouble. The tedious part is that it requires you to have a passport and to be able to scan it.

If you run into any problems, get in touch.

A short script for testing writing many files to a folder

The challenge: we want to see how the number of files in a folder affects the performance of adding new files to the same folder. Two examples of where we may need to do this are: to get an overview of the performance of the file system’s node structure, or to test the effect of Windows’ 8dot3 name compatibility feature.

The solution: we create a script that writes a large number of files to the folder in question and logs the time taken at specific milestones. The records logged during execution tell us how long it takes to write the files up to each milestone, and from this we can infer how efficiently the file system writes files between the different milestones.

Example of output
A graph representing the number of files created over time. The X axis conveys the number of seconds elapsed, and the Y axis the number of files created. What does your function look like?

The implementation: I’ve chosen to put the creation of new files in a for loop which runs N times based on user input. Each iteration opens a new file with an incremental file name, writes the payload to the file, and finally closes the file and increments the loop counter.

Around this core functionality we need to define which folder the files will be created in and what data is to be read and written. We need to read that data into a variable (we don’t want to add overhead by reading the data-to-write on every iteration) and create the test folder if it does not already exist. In addition we need a function that writes the timestamp and the iteration number to a file.

To allow multi-process testing I’ve also added a loop that spawns new processes and passes on the number of files, and to cover more scenarios, e.g. renaming and deleting files, more actions have been added.

The actions, the test folder path, the input file and the number of files and processes are things the user will most likely change frequently, so instead of hard-coding them they are provided as command-line arguments. As always when dealing with command-line arguments: provide good defaults, since the user is unlikely to set every available parameter.

From description to code this will look something like this:

import time
import os
import string
import random
from multiprocessing import Process
import multiprocessing
import optparse
import os.path

def main(files_each=100, processes=10, actions="a", log_interval=100, temp_path="temp_files", infile="infile.txt"):
  path = temp_path
  check_and_create_folder_path(path)
  for i in range(processes):
    p = Process(target=spawnTask, args=(path, files_each, actions, log_interval, infile))
    p.start()

def print_time_delta(start_time, comment, outfile=False):
  if not outfile:
    print(comment," | ",time.time() - start_time, " seconds")
  else:
    with open(outfile, 'a+') as out:
      out.write("{0} | {1} \n".format(time.time() - start_time, comment))

def spawnTask(path,files_each, actions,log_interval, infile):
  start_time = time.time()
  content = read_file_data(infile)

  print_time_delta(start_time,"creating files for process: "+str(os.getpid()))
  created_files = createfiles(files_each, content,path,start_time, log_interval)
  if(actions == 'a' or actions == 'cr'):
    print_time_delta(start_time,"renaming files for process: " +str(os.getpid()))
    renamed_files = rename_files(created_files,path,start_time, log_interval)
  if(actions == 'a'):
    print_time_delta(start_time,"deleting files for process: "+str(os.getpid()))
    delete_files(renamed_files,path,start_time, log_interval)

  print_time_delta(start_time,"operations have ended. Terminating process:"+str(os.getpid()))

def createfiles(number_of_files, content,path,start_time, log_interval):
  own_pid = str(os.getpid())
  created_files = []
  for i in range(number_of_files):
    # Log a milestone every log_interval files
    if (i % log_interval == 0):
      print_time_delta(start_time, str(i)+" | "+own_pid+" | "+"create","prod_log.txt")
    filename = "wordfile_test_"+own_pid+"_"+str(i)+".docx"
    created_files.append(filename)
    with open(path+"\\"+filename,"wb") as print_file:
      print_file.write(content)

  print_time_delta(start_time, str(number_of_files) +" | "+own_pid+" | "+"create","prod_log.txt")

  return created_files

def rename_files(filenames,path,start_time, log_interval):
  new_filenames = []
  own_pid = str(os.getpid())
  i = 0
  for file in filenames:
    # Log a milestone every log_interval files
    if (i % log_interval == 0):
      print_time_delta(start_time, str(i)+" | "+own_pid+" | "+"rename","prod_log.txt")
    # Rename the file to a random 30-character name
    lst = [random.choice(string.ascii_letters + string.digits) for n in range(30)]
    text = "".join(lst)
    os.rename(path+"\\"+file,path+"\\"+text+".docx")
    new_filenames.append(text+".docx")
    i += 1

  print_time_delta(start_time, str(len(new_filenames))+" | "+own_pid+" | "+"rename","prod_log.txt")

  return new_filenames

def delete_files(filenames,path,start_time, log_interval):
  num_files = len(filenames)
  own_pid = str(os.getpid())
  i = 0
  for file in filenames:
    # Log a milestone every log_interval files
    if (i % log_interval == 0):
      print_time_delta(start_time, str(i)+" | "+own_pid+" | "+"delete","prod_log.txt")
    os.remove(path+"\\"+file)
    i += 1

  print_time_delta(start_time, str(num_files)+" | "+own_pid+" | "+"delete","prod_log.txt")

def check_and_create_folder_path(path):
  if not os.path.exists(path):
    os.makedirs(path)

def read_file_data(infile):
  with open(infile,"rb") as content_file:
    content = content_file.read()
  return content

if __name__ == "__main__":
  multiprocessing.freeze_support()
  parser = optparse.OptionParser()
  parser.add_option('-f', '--files', default=100, help="The number of files each process should create. Default is 100")
  parser.add_option('-p', '--processes', default=10, help="The number of processes the program should create. Default is 10")
  parser.add_option('-a', '--action', default='a', help="The action which the program should perform. The default is a.\n Options include a (all), c (create), cr (create and rename)")
  parser.add_option('-l', '--log_interval', default=100, help="The interval between when a process is logging files created. Default is 100")
  parser.add_option('-t', '--temp_path', default="temp_files", help="Path where the file processes will be done")
  parser.add_option('-i', '--infile', default="infile.txt", help="The file which will be used in the test")

  options, args = parser.parse_args()
  main(int(options.files), int(options.processes), options.action, int(options.log_interval), options.temp_path, options.infile)
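
With the defaults above, a test run creating 1,000 files in each of four processes could be started roughly like this (the script name write_files.py is just an assumption, as the post does not name the file, and an infile.txt payload file must exist in the same folder):

python write_files.py -f 1000 -p 4 -a a -l 100 -i infile.txt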

 

 

Sample from the output log

The output from running this script is a pipe-separated (‘|’) list with the elapsed seconds, the number of files, the process ID (since the program can spawn several similar processes simultaneously, we need a way to tell them apart) and the action. It will look something like the sample below, and from these numbers you can create statistics on performance at different folder sizes.
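
Purely as an illustration of the shape of these records (the numbers below are made up, not actual measurements), a few lines of prod_log.txt would read elapsed seconds, file count, process ID and action:

0.013 | 0 | 5012 | create
1.874 | 100 | 5012 | create
3.906 | 200 | 5012 | create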

The idea of performing this analysis, and valuable feedback along the way, came from great colleagues at Steria AS. Any issues or problems with the code or text are solely my own. Whatever you use this information for is solely your own responsibility.

The folder image is by Erik Yeoh and is released under a Creative Commons Attribution-NonCommercial-ShareAlike License. The image can be found on Flickr.

Two Good Tools for Peeking Inside Windows

Over the last couple of weeks I have explored the inner mechanics of Microsoft Windows and the processes that run in this context. Two tools have proved especially useful in this work: Xperf logs viewed with WPA, and Sysinternals’ Process Monitor.

Xperf/WPA

While running, Windows continuously monitors its internal processes through Event Tracing for Windows (ETW). We can harness this instrumentation by extracting the data with the program Xperf. From specific points in the execution (which we decide when we start the logging) we can sample what is happening within a program and in the system as a whole. Two good features I have used from the Xperf/WPA combo are:

  •  Process images: we can see which ‘.dll’ images are loaded by each process, and when they are loaded.
  •  Resource usage over time: we can view what system resources, such as memory, CPU and IO, are used by each process or by the system as a whole.

Both these tools are included in the Windows Performance Toolkit, which in turn is part of the Windows Assessment and Deployment Kit. During the Assessment and Deployment Kit installation you can choose to install only the Performance Toolkit.

To record a session you call Xperf twice from the command shell: first to start the logging, with flags that specify which internal events should be sampled, and then to stop the logging and write the results to an .etl file.

A typical start command could be:

xperf -on latency -stackwalk profile

In this example xperf is called with the latency kernel group, which covers the following flags: PROC_THREAD+LOADER+DISK_IO+HARD_FAULTS+DPC+INTERRUPT+CSWITCH+PROFILE. The -stackwalk option records a stack for the flags or group provided. For a complete list of kernel flags you can use the “xperf -providers k” command.

Once you have started Xperf and performed the action you wanted to record, you can stop xperf with this command:

xperf -d output.etl

The -d option explicitly tells xperf to append the logged session to the file output.etl (or to create the file if it does not exist). The command also implicitly stops the logging session.

For full overview over the commands accepted by Xperf, please refer to the Xperf options documentation at MSDN.

To analyze an .etl file, and the data that has been collected in the logging session, Microsoft has made available a good tool: Windows Performance Analyzer.

Windows Performance Analyzer is a part of the Windows Performance Toolkit.

This neat tool provides small views on the left showing general KPIs for the resources, and each of the main resources has an expandable menu with more detailed views. Double-clicking, or right-clicking and choosing to open a section in the main window, opens a more detailed overview in the right-hand part of the application, where you can drill down into the details. In the screenshot you can see the images loaded by the relatively simple command-line application Ping.

Process Monitor

The Sysinternals toolkit contains many useful tools for various Windows-related tasks, among them the ability to see the activities of a process over time. The latter is squarely in the domain of Process Monitor. With this convenient tool you can get an overview of the operations a process performs, including registry queries, use of file resources and loading of images.

With Process Monitor you can survey the system as a whole, or filter for a specific process. The program traces many of the calls a program makes to the system, and you can use this trace to see the sequence in which a program executes and which system resources it relies on. The Xperf and WPA combination gives a good overview of the images loaded by a process; with Process Monitor you can expand this knowledge with registry queries and network calls, and also look at when different profiling actions are called.

Process Monitor from the Sysinternals Suite is a good tool to scrutinize what is happening with one or more processes.

Process Monitor is used both for recording a trace and for analyzing it afterwards. Traces can be saved to file. They can also be conveniently filtered: either on the specific types of actions performed by a process (registry, file system, network resources, process and thread activity, and profiling), using the symbols to the right of the menu, or through the filter dialog shown in the overlying window in the image. A good rule there is to exclude all actions not associated with the process you want to survey.

Be advised that Process Monitor records a huge number of actions. It is a good idea to turn off recording when you do not intend to capture anything; this can be done by toggling the magnifying glass in the menu.

An advantage of the programs in the Sysinternals toolkit, Xperf and WPA is that they do not need to be installed to work. All these tools can be put on a USB stick, and with some training you suddenly become a one-man army ready to examine Windows inside out.

The image used to illustrate this blog post is by Julian E…; it was found through Flickr and is shared under a Creative Commons by-nc-nd license.

Work programmatically with Google Spreadsheets Part 2

A while back I wrote a short post on how you can read from and write to Google Spreadsheets programmatically using Python and the package ‘gspread’.

Last time the reading was done by first creating arrays with the addresses of the cells where the values could be found, and then running through all the addresses, replacing each with its value. It worked fine, but it is not best practice or very efficient, as it makes many single requests against the API. In part two, I will share a short tip on how to read the values in one go instead of iterating through a range of cells.

Here is the excerpt dealing with retrieving values. (NB: see original blogpost for gspread initialization).

#Running through the values
get_val = []
set_name = []
set_country = []
for a in range(2,5124):
    v = "B%s" % a
    sn = "H%s" % a
    sc = "G%s" % a
    get_val.append(v)
    set_name.append(sn)
    set_country.append(sc)

for a in range(2,5124):
    try:
        name = worksheet.acell(get_val[a]).value
        res = getCountry(name)
        if res:
            print res
            country, last_id, name = res
            worksheet.update_acell(set_name[a], name)
            worksheet.update_acell(set_country[a], country)
    except Exception as e:
        print e

In a recent script we only wanted to download values from a Google spreadsheet (yes, we could have exported the files to .csv with a similar result, but with a script we can extend and parse the data if needed), and this also gave some time to refactor the code.

The gspread function worksheet.get_all_values() returns a list of lists with the values. The outer list contains the rows, and each row list contains the column values, indexed by column position. In this example num_streams is the second column, so its position is [1], as the list is zero-indexed.

Also note the nifty way of writing UTF-8 formatted strings to the file. UTF-8 can often cause headaches, but put a “u” prefix before the string literal and open the stream with codecs.open(“filename”, ”mode”, ”encoding”).

The new way of retrieving data from a Google Docs Spreadsheet:

# -*- coding: UTF-8 -*-
import gspread
import codecs

# Global variables for accessing resource
G_USERNAME = 'user_email'
G_PASSWORD = 'password'
G_IDENTIFIER = 'spreadsheet_identifier'

# Connecting to the data source
gc = gspread.login(G_USERNAME,G_PASSWORD)
sht1 = gc.open_by_key(G_IDENTIFIER)
worksheet = sht1.get_worksheet(0)

all_val = worksheet.get_all_values()

output = codecs.open('output_norwegian_artists.csv','wb', "utf-8-sig")

for l in all_val:
    num_streams, artistid, name = (l[1], l[2], l[3])
    norwegian = l[4]
    if len(norwegian) < 3:
        norwegian = 'NULL'

    string = u"{0};{1};{2};{3}\n".format(num_streams, artistid, name, norwegian)
    output.write(string)

output.close()

 

The picture is licensed under a Creative Commons Attribution license by the brilliant organization Open Knowledge Foundation, and was retrieved through a CC search on Flickr.