Assignment - Spectral Peaks

Aim

Given a data set representing the energy with respect to frequency, identify the location of any peaks and estimate the intensity (area within peak). The data is noisy and there is an underlying non-linear trend.

Sample spectral data (with 9 peaks).

You will need to

\[ bx + c \qquad\mbox{or}\qquad a x^2 + b x + c \qquad\mbox{or}\qquad a e^{b x} + c \]
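As a rough illustration of fitting one of the candidate trend models above, the sketch below fits the quadratic form with NumPy's `polyfit` and subtracts it from the data. The function name `fit_quadratic_trend` and the synthetic data are assumptions made for this example only, not part of the assignment. The exponential form is not polynomial, so it would need a non-linear fitter such as `scipy.optimize.curve_fit` instead.

```python
import numpy as np

def fit_quadratic_trend(x, y):
    """Least-squares fit of a*x**2 + b*x + c; returns (a, b, c).

    Hypothetical helper for illustration only.
    """
    a, b, c = np.polyfit(x, y, deg=2)  # coefficients, highest power first
    return a, b, c

# Synthetic data (an assumption): a known quadratic trend plus small noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 500)
y = -0.8 * x**2 + 0.4 * x + 1.2 + rng.normal(scale=0.01, size=x.size)

a, b, c = fit_quadratic_trend(x, y)

# Subtracting the fitted trend leaves the residual signal.
detrended = y - (a * x**2 + b * x + c)
```

On real data the peaks themselves pull a plain least-squares fit away from the baseline, so it may be worth refitting after masking out the points that belong to peaks.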

Dataset

Each data file consists of two columns (frequency, energy); both are scaled so that

I have two data files that are common to all of you and four individual data files each. The common data files are

The individual data files are determined by your student number. Under GDPR I should not publish student numbers, so instead run the following code with your own student number to list your files.

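import hashlib
import os

def list_dataset(id='W00000000'):
    """Print the four data file names for a student number and whether each exists in data/."""

    def name_to_seed(s):
        # Hash the problem name and reduce it to a 32-bit integer seed.
        return int(hashlib.sha512(str(s).encode()).hexdigest(),16) % 2**32

    for p in ['A','B','C','D']:
        problem = id.upper() + '-' + p
        seed = name_to_seed(problem)
        # The seed is zero-padded to 10 digits to form the file name.
        filename = f'p-{seed:010d}.txt'

        # The last column is True if the file is present in the data folder.
        print(id, p, filename, os.path.isfile(f'data/{filename}'))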
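If you would rather have the file names as values you can loop over than as printed text, a variant along the following lines may help. It is only a sketch: the name `dataset_filenames` is hypothetical, and it reuses the same hashing scheme as `list_dataset` above without checking whether the files exist.

```python
import hashlib

def dataset_filenames(student_id="W00000000"):
    """Return {problem letter: file name} using the same scheme as list_dataset.

    Hypothetical helper for illustration only.
    """
    names = {}
    for p in ["A", "B", "C", "D"]:
        problem = student_id.upper() + "-" + p
        digest = hashlib.sha512(problem.encode()).hexdigest()
        seed = int(digest, 16) % 2**32      # 32-bit seed, as in list_dataset
        names[p] = f"p-{seed:010d}.txt"     # zero-padded to 10 digits
    return names

# Placeholder student number taken from the default argument above.
files = dataset_filenames("W00000000")
for letter, name in files.items():
    print(letter, name)
```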

All of the images are in the archive.
I have also included some simpler data files which should help the development of your code.

To use the data files in the archive, run the following code, which downloads the archive to the folder containing your notebook and extracts its contents into the current folder. Note that it creates two folders

import urllib.request
import zipfile

url = 'https://setu-computationalphysics.github.io/live/topics/05-Spectral_Peaks/01-Problem_Specification/files/individual.zip'

# Download the archive next to the notebook.
urllib.request.urlretrieve(url, 'data.zip')

# Extract the contents into the current folder.
with zipfile.ZipFile('data.zip', 'r') as zip_ref:
    zip_ref.extractall('.')

Required functions

Your notebook should have the function

def process_data(filename='kmurphy.txt', ADD_YOUR_OTHER_ARGUMENTS_HERE_IF_WANTED):
    pass

which when passed the name of a txt file in data containing the data points:

Output on Generating a Dataset

Each problem in the data folder should generate two files in the output folder with the same basename, one txt and one csv. The txt file should have contents like

Trend:
    -0.826 x^2 +0.437 x +1.188
Number of peaks: 9
    Peak 0: c=0.1531    h=-0.7210    w=0.0152    a=0.0073
    Peak 1: c=0.2067    h= 0.8023    w=0.0161    a=0.0087
    Peak 2: c=0.2981    h=-0.7538    w=0.0162    a=0.0082
    Peak 3: c=0.3662    h=-0.7761    w=0.0165    a=0.0086
    Peak 4: c=0.4095    h=-0.9409    w=0.0156    a=0.0098
    Peak 5: c=0.5164    h= 0.7872    w=0.0178    a=0.0094
    Peak 6: c=0.7771    h= 0.6157    w=0.0188    a=0.0078
    Peak 7: c=0.8169    h=-0.8501    w=0.0118    a=0.0067
    Peak 8: c=0.8675    h=-0.9438    w=0.0128    a=0.0081

and the csv should have contents like

centre,height,width,area
0.15313706712831004,-0.7210378048534919,0.015214553810080614,0.007339635054804814
0.20672243434181659,0.8022816492248048,0.016131975744163238,0.0086590775416792
0.2981468738739708,-0.7537656205977101,0.0162392088819803,0.008189519158254182
0.3661868473275811,-0.7761159842494884,0.016479398153909846,0.008557071965497709
0.4094939136295262,-0.9409097426352324,0.01557047009979476,0.009801824000493207
0.5164469575753882,0.7871710197136214,0.01776265756743838,0.009354794467043001
0.7770860565574634,0.6156611198029336,0.018847540823450016,0.007763429801721572
0.8168541310701665,-0.8501072299550835,0.011831154164392762,0.006729116279430779
0.8675404723019733,-0.9437532545518952,0.012808659598865919,0.00808759427537501
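One possible way to produce both files is sketched below. The function name `write_peak_outputs`, its arguments, and the representation of each peak as a `(centre, height, width, area)` tuple are assumptions made for this example; the layout is copied from the samples above. The usage part writes to a temporary folder so it does not touch your real output folder.

```python
import csv
import os
import tempfile

def write_peak_outputs(basename, trend_text, peaks, folder="output"):
    """Write <basename>.txt and <basename>.csv into folder.

    Hypothetical helper: peaks is a list of (centre, height, width, area) tuples.
    """
    os.makedirs(folder, exist_ok=True)
    txt_path = os.path.join(folder, basename + ".txt")
    csv_path = os.path.join(folder, basename + ".csv")

    # Human-readable summary, rounded to 4 decimal places.
    with open(txt_path, "w") as f:
        f.write("Trend:\n")
        f.write(f"    {trend_text}\n")
        f.write(f"Number of peaks: {len(peaks)}\n")
        for i, (c, h, w, a) in enumerate(peaks):
            # "{h: .4f}" leaves a space where the minus sign would go.
            f.write(f"    Peak {i}: c={c:.4f}    h={h: .4f}    w={w:.4f}    a={a:.4f}\n")

    # Full-precision values, one peak per row.
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["centre", "height", "width", "area"])
        writer.writerows(peaks)

    return txt_path, csv_path

# Usage with the first two peaks from the sample output above.
peaks = [
    (0.15313706712831004, -0.7210378048534919, 0.015214553810080614, 0.007339635054804814),
    (0.20672243434181659, 0.8022816492248048, 0.016131975744163238, 0.0086590775416792),
]
with tempfile.TemporaryDirectory() as tmp:
    txt_path, csv_path = write_peak_outputs(
        "example", "-0.826 x^2 +0.437 x +1.188", peaks, folder=tmp)
    with open(txt_path) as f:
        txt_lines = f.read().splitlines()
    with open(csv_path) as f:
        csv_lines = f.read().splitlines()
```

Keeping the rounding only in the txt file means the csv retains full precision for any later processing.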

Deliverables

Notebook containing code and conclusions describing your implementation.

Submit using Moodle