Welcome to geog0111: Scientific Computing

The course aims to:

impart an understanding of scientific computing
give students a grounding in the basic principles of algorithm development and program construction
introduce principles of computer-based image analysis and model development

It is open to students from a number of MSc courses run by the Department of Geography, UCL, but the material should be of wider value to others wishing to make use of scientific computing.

The module will cover:

Computing in Python
Computing for image analysis
Computing for environmental modelling
Data visualisation for scientific applications

Learning Outcomes

At the end of the module, students should:

have an understanding of the Python programming language and experience of its use
have an understanding of algorithm development and be able to use widely used scientific computing software to manipulate datasets and accomplish analytical tasks
have an understanding of the technical issues specific to image-based analysis, model implementation and scientific visualisation

Timetable

The course takes place over 10 weeks in term 1, in the Geography Department Unix Computing Lab (PB110) in the Pearson Building, UCL.

Classes take place from the second week of term to the final week of term, other than Reading week. See UCL term dates for further information.

The timetable is available on the UCL Academic Calendar

Assessment

Assessment is through two pieces of coursework, submitted in both paper form and electronically via Moodle.

See the Moodle page for more details.

Useful links

Course Moodle page

Python

Python is a high-level programming language that is freely available, relatively easy to learn and portable across different computing systems. In Python, you can rapidly develop solutions for the sorts of problems you might need to solve in your MSc courses and in the world beyond. Code written in Python is also easy to maintain, is (or should be) self-documenting, and can easily be linked to code written in other languages.

Relevant features include:

code is compiled automatically when it is run, so there is no separate compile step
code is portable, provided you have the appropriate Python modules
for compute-intensive tasks, you can easily make calls to routines written in (faster) lower-level languages such as C or FORTRAN
there is an active user and development community, which means that new capabilities appear over time and many existing extensions and enhancements are easily available to you

For further background on Python, look over the Advanced Scientific Programming in Python material or the python.org website.

We assume that you are new to computing in this course. We will not explicitly go through the Unix (operating system) notes, but you should make yourself familiar with the basic concepts.

Using the course notes

We will generally use Jupyter notebooks for running interactive Python programs.

You will probably want to run each session and store scripts in your Data (or DATA) directory.

If you are taking this course at UCL, the notes should already have been downloaded to your DATA directory.

If so, then:

cd /directory/to/geog0111
git reset --hard HEAD
git pull

will update the notes (for any changes I make over the sessions).

If you do not yet have the notes and want to run the sessions directly in the notebook, download the course material from GitHub with:

cd /directory/to/
git clone https://github.com/profLewis/geog0111.git

to obtain the notes.

Using python

We suggest you use the Anaconda Python distribution. If you are not using the UCL resources (i.e. you are using your own computer), you should download and install an Anaconda distribution. If you are using the UCL computers, it should be there already.

You may also find it of value to have git installed.

Assuming you have a copy of the notes in the directory ('folder') ~/DATA/geog0111 then you can set up a specific 'environment' in which to run these notes:

cd /directory/to/geog0111
conda env create -f environment.yml

If you are updating the notes and the geog0111 environment already exists, use instead:

cd /directory/to/geog0111
conda env update -f environment.yml

This will create (or update) an environment called geog0111 and make sure you have all of the required dependencies.

Once the environment exists, activate it and install the course package with:

conda activate geog0111
python setup.py install
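
If you want to confirm from within Python that the environment is the one in use, a minimal sketch is given below. It assumes that conda sets the `CONDA_DEFAULT_ENV` environment variable on activation; the fallback message is only illustrative.

```python
import os
import sys

# conda sets CONDA_DEFAULT_ENV when an environment is activated;
# the fallback string is only an illustrative placeholder
active_env = os.environ.get("CONDA_DEFAULT_ENV", "(no conda environment active)")

print(f"Active conda environment: {active_env}")
print(f"Python interpreter: {sys.executable}")
print(f"Python version: {sys.version_info.major}.{sys.version_info.minor}")
```

If the first line does not report geog0111, the environment has probably not been activated in this terminal.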

For further advice on checking, setting or deleting conda environments, see the conda help pages.

To start the notebook for the first session:

cd /directory/to/geog0111
jupyter notebook Chapter1_Python_intro.ipynb

You quit a Jupyter notebook session with ^C (Control C).

To execute ('run') blocks of Python code in the notebook, use SHIFT-RETURN (the SHIFT and RETURN keys together).

Alternatively, just run ipython:

cd /directory/to/geog0111
ipython

and type your own commands in at the prompt, following the class or the material on the webpages.

Course Notes

Course notes

Help

Help Connections to the lab

Issues with Chapter 1

Section 1.1.6 introduces lists, but the example tests both equality and identity (whether the variables point to the same chunk of memory), which is unhelpful at this stage:

# Comparison examples

# is the list [1 + 1] the same object as the list [2]?
print([1 + 1] is [2])   # <----- Lists are new!
                        # <----- Also, testing whether the pointer is the same is inappropriate at this stage

# is the list [1 + 1] equal to the list [2]?
print([1 + 1] == [2])
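
If the distinction is kept in the notes, a minimal sketch like the following might make it easier to see what the two comparisons above are testing. The variable names are illustrative only.

```python
# Two separately created lists with the same contents
a = [1 + 1]
b = [2]

same_value = (a == b)     # compares the contents of the lists
same_object = (a is b)    # compares identity: is it one object in memory?

# A second name for an existing list refers to the same object
c = a
alias_is_same = (c is a)

print(same_value, same_object, alias_is_same)   # True False True
```

The point is that `==` answers "do these hold the same values?", whereas `is` answers "are these two names for one object?".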

In the example before exercise 1.1.15, retval doesn't need to be predefined. A better example might also show a more complex if statement, where you can see which code blocks are executed and which are ignored. The exercise would then be to write out a truth table for the statement. As written below, running this with my_number = 1...10 would give AABBBBCCCC. It could then be used to introduce loops, showing the effect of the loop directly rather than working with strings (1.2.1 and 1.2.2).

my_number = 7

if my_number < 3:
    print(f"{my_number:d} less than 3")
    output = "A"
elif 3 <= my_number < 7:
    print(f"{my_number:d} at least 3 and less than 7")
    output = "B"
elif my_number >= 7:
    print(f"{my_number:d} greater than or equal to 7")
    output = "C"
print(output)
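
The loop version suggested above could be sketched as follows. It is only an illustration: the `labels` variable is an assumption, and the final branch is written as `else` because the first two conditions already cover every other case.

```python
labels = ""

# Try every value from 1 to 10 and record which branch ran
for my_number in range(1, 11):
    if my_number < 3:
        output = "A"
    elif 3 <= my_number < 7:
        output = "B"
    else:
        output = "C"
    labels += output

print(labels)   # AABBBBCCCC
```

Students can check the printed string against the truth table they wrote by hand.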

I think it is a good thing to have students start thinking about how to put things together using simpler problems. The web scraping example is OK for homework, but it looks quite daunting, so it might be good to show a quick example of splitting a simple file/text variable first. Here's an example with a random dataset from the internet (original dataset; I removed Zaire and Tanzania because of missing values, but they can be added later if needed):

data_set = """Country	Life Expectancy	People per Television	People per Physician	Female Life Expectancy	Male Life Expectancy
Argentina	70.5	4	370	74	67
Bangladesh	53.5	315	6166	53	54
Brazil	65	4	684	68	62
Canada	76.5	1.7	449	80	73
China	70	8	643	72	68
Colombia	71	5.6	1551	74	68
Egypt	60.5	15	616	61	60
Ethiopia	51.5	503	36660	53	50
France	78	2.6	403	82	74
Germany	76	2.6	346	79	73
India	57.5	44	2471	58	57
Indonesia	61	24	7427	63	59
Iran	64.5	23	2992	65	64
Italy	78.5	3.8	233	82	75
Japan	79	1.8	609	82	76
Kenya	61	96	7615	63	59
Korea, North	70	90	370	73	67
Korea, South	70	4.9	1066	73	67
Mexico	72	6.6	600	76	68
Morocco	64.5	21	4873	66	63
Myanmar (Burma)	54.5	592	3485	56	53
Pakistan	56.5	73	2364	57	56
Peru	64.5	14	1016	67	62
Philippines	64.5	8.8	1062	67	62
Poland	73	3.9	480	77	69
Romania	72	6	559	75	69
Russia	69	3.2	259	74	64
South Africa	64	11	1340	67	61
Spain	78.5	2.6	275	82	75
Sudan	53	23	12550	54	52
Taiwan	75	3.2	965	78	72
Thailand	68.5	11	4883	71	66
Turkey	70	5	1189	72	68
Ukraine	70.5	3	226	75	66
United Kingdom	76	3	611	79	73
United States	75.5	1.3	404	79	72
Venezuela	74.5	5.6	576	78	71
Vietnam	65	29	3096	67	63"""

selected_country = "Turkey"

# Split string by new line characters or `\n`
for line in data_set.split("\n"):
    # If the selected country is found in this line, carry on
    if line.find(selected_country) >= 0:
        # Split the line by `\t` character
        (country, life_expectancy, people_per_tv,
         people_per_dr, fem_lif_exp, mal_lif_exp) = line.split("\t")
        # Print some relevant information
        print(f"{selected_country:s} has {people_per_tv:s} people per TV set")

This could be extended to calculate some simple statistics (minimum and maximum values, mean values, that kind of thing) as a longer in-class exercise. The advantage is that students can look directly at the data and see whether they got it right or not. They also need to come up with some sort of algorithm, or search around to find out about min and max.

countries = []
tvs = []

# Split string by new line characters or `\n`
for iline, line in enumerate(data_set.split("\n")):
    if iline > 0:
        # Split the line by `\t` character
        (country, life_expectancy, people_per_tv,
             people_per_dr, fem_lif_exp, mal_lif_exp) = line.split("\t")
        countries.append(country)
        tvs.append(float(people_per_tv))

### Using loops
        
max_tvs = max(tvs)
min_tvs = min(tvs)

for iloc, tv in enumerate(tvs):
    if tv == max_tvs:
        print(f"Country with most people per TV is {countries[iloc]:s}")
    elif tv == min_tvs:
        print(f"Country with fewest people per TV is {countries[iloc]:s}")

## Searching in a list. Note: equality comparisons with floating point numbers are dangerous!

print(f"Country with most people per TV is {countries[tvs.index(max_tvs)]:s}")
print(f"Country with fewest people per TV is {countries[tvs.index(min_tvs)]:s}")
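
The warning about floating point comparisons can be illustrated with a short sketch. The search above is safe because `max()` and `min()` return values taken from the list itself; the problem arises once a value has been computed. The use of `math.isclose` here is a suggestion, not something taken from the course notes.

```python
import math

# 0.1 + 0.2 is not stored as exactly 0.3 in binary floating point
total = 0.1 + 0.2

exact_match = (total == 0.3)              # False
close_match = math.isclose(total, 0.3)    # True, within a small relative tolerance

print(total)
print(exact_match, close_match)
```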

This can be extended into a longer example with lists/dictionaries...

country_dictionary = {}

# Split string by new line characters or `\n`
for iline, line in enumerate(data_set.split("\n")):
    if iline > 0:
        # Split the line by `\t` character
        (country, life_expectancy, people_per_tv,
             people_per_dr, fem_lif_exp, mal_lif_exp) = line.split("\t")
        country_dictionary[country] = float(people_per_tv)

max_tvs = 0
for country, tv_sets in country_dictionary.items():
    if tv_sets >= max_tvs:
        max_tvs = tv_sets
        max_tvs_country = country
print(max_tvs_country, max_tvs)
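
The mean mentioned earlier is not covered by the examples above. A minimal sketch follows, using only three rows copied from the dataset so that it runs on its own. It also shows the built-in `max()` with a `key` argument as one possible alternative to the explicit loop; which version to teach first is a judgment call.

```python
# Three rows copied from the dataset above (people per television)
country_dictionary = {"Argentina": 4.0, "Bangladesh": 315.0, "Brazil": 4.0}

# Mean: sum of the values divided by how many there are
mean_tvs = sum(country_dictionary.values()) / len(country_dictionary)

# Country with the largest value: max() over the keys, ranked by their values
max_tvs_country = max(country_dictionary, key=country_dictionary.get)

print(f"Mean people per TV: {mean_tvs:.1f}")
print(max_tvs_country, country_dictionary[max_tvs_country])
```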

It might be a good thing to introduce listings of the methods available for different types, with links to the documentation for anyone who wants to find out more about them. For example, for a list:

print(dir([1,2,3]))
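
The raw `dir()` output includes many names beginning with underscores, which may distract newcomers. One possible way to show only the everyday methods is sketched below; the `public_methods` name is illustrative.

```python
# Keep only the names from dir() that do not start with an underscore
public_methods = [name for name in dir([1, 2, 3]) if not name.startswith("_")]
print(public_methods)

# Each method carries a short description in its docstring
print(list.append.__doc__)
```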

