1. Customization

Because there is no standard format for SEC-SAXS data, you may need to customize the loaders to match your own data formats. This chapter briefly explains how to do so. Alternatively, you can open an issue at the library’s repository to request support for your data formats.

You will most likely need to customize only the UV data input: the formats for X-ray scattering profiles are fairly well unified thanks to the popularity of the ATSAS suite, whereas there is no such de facto standard for UV absorbance data.

1.1. Assumption on Data Location

We assume here that the UV and X-ray data sets are located in a single folder, so that both can be loaded by specifying one folder path.

Note

In our original implementation, the X-ray data set is assumed to consist of many files with the .dat extension, while the UV data set consists of a single file with the .txt extension. In any case, the loader functions explained here should be implemented so that they can load the data from a folder path that identifies both data sets. The data specification for our original case is documented in detail in chapter four of the MOLASS User’s Guide.
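As a quick way to check that a folder matches this layout before writing a loader, the following minimal sketch lists the .dat and .txt files it contains. The helper name `summarize_data_folder` and the demo file names are hypothetical and not part of the library; only the "many .dat files plus one .txt file" layout comes from the note above.

```python
import os
import tempfile
from glob import glob


def summarize_data_folder(folder_path):
    """Return the sorted .dat paths and .txt paths found in folder_path.

    Hypothetical helper for illustration only; it is not part of the library.
    """
    if not os.path.isdir(folder_path):
        raise NotADirectoryError(folder_path)
    xr_files = sorted(glob(os.path.join(folder_path, "*.dat")))  # X-ray profiles
    uv_files = sorted(glob(os.path.join(folder_path, "*.txt")))  # UV data file(s)
    return xr_files, uv_files


# Demonstration on a temporary folder with made-up, empty files.
with tempfile.TemporaryDirectory() as folder:
    for name in ("sample_00000.dat", "sample_00001.dat", "sample_UV.txt"):
        open(os.path.join(folder, name), "w").close()
    xr_files, uv_files = summarize_data_folder(folder)
    n_xr, n_uv = len(xr_files), len(uv_files)

print(f"{n_xr} X-ray file(s), {n_uv} UV file(s)")
```

If your folder holds more than one .txt file, or uses other extensions, the loaders described below are the place to encode that difference.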

1.2. Test Data

Use the test data for this tutorial, which can be downloaded from the following links:

In this data set, the X-ray data set consists of files. Compare your data to this data set and follow the instructions below.

1.3. Source Code

Download or git clone the library’s repository.

1.4. UV Data Input

Replace the following function in the molass/DataUtils/UvLoader.py Python file to adapt to your data.

import os

import numpy as np


def load_uv(path):
    """
    Load UV data from a directory.

    Parameters
    ----------
    path : str
        Path to the UV data directory.
        
    Returns
    -------
    uvM : np.ndarray
        UV data matrix.
        The first dimension corresponds to the wavelengths, and the second to the frames (elution points).
    wvector : np.ndarray
        Wavelength vector.
    """
    assert os.path.isdir(path)
    # Implement your data loading code here; it must define uvM and wvector.
    return uvM, wvector

To test this function, you can use the test script in the tests/specific/900_Custom folder as follows.

cd tests/specific/900_Custom
pytest test_010_Custom.py -s -k test_010_load_uv

Note

Our original implementation of this function can load from either a file path or a folder path for backward compatibility. However, only the implementation for a single file is required in current use.
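To make the template more concrete, here is a minimal sketch of one possible load_uv. It assumes a hypothetical file format that is not the library's original one: the folder contains exactly one .txt file, whitespace-delimited, with the wavelength in the first column and one absorbance column per frame, and with comment lines starting with "#". Adapt the parsing to whatever your instrument actually writes.

```python
import os
import tempfile
from glob import glob

import numpy as np


def load_uv(path):
    """Sketch of a UV loader for a hypothetical single-.txt-file format."""
    assert os.path.isdir(path)
    txt_files = sorted(glob(os.path.join(path, "*.txt")))
    if len(txt_files) != 1:
        raise ValueError(
            f"expected exactly one .txt file in {path}, found {len(txt_files)}"
        )
    # Assumed layout: column 0 is the wavelength, columns 1.. are the frames.
    table = np.loadtxt(txt_files[0], ndmin=2)
    wvector = table[:, 0]
    uvM = table[:, 1:]  # shape: (number of wavelengths, number of frames)
    return uvM, wvector


# Demonstration with a small synthetic file: 3 wavelengths, 4 frames.
with tempfile.TemporaryDirectory() as folder:
    demo = np.column_stack([[280.0, 281.0, 282.0], np.arange(12.0).reshape(3, 4)])
    np.savetxt(os.path.join(folder, "demo_UV.txt"), demo,
               header="wavelength frame0 frame1 frame2 frame3")
    uvM, wvector = load_uv(folder)

print(uvM.shape, wvector)
```

The returned shapes follow the docstring of the template above: the first dimension of uvM corresponds to the wavelengths and the second to the frames.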

1.5. X-ray Data Input

Replace the following function in the molass/DataUtils/XrLoader.py Python file to adapt to your data.

import os
from glob import glob

import numpy as np


def load_xr(folder_path):
    """
    Load X-ray scattering data from a folder containing .dat files.

    Parameters
    ----------
    folder_path : str
        Path to the folder containing .dat files.

    Returns
    -------
    xr_array : np.ndarray
        3D array containing the X-ray scattering data.

    Notes
    -----
    The function assumes that each .dat file contains data in a format compatible with np.loadtxt.
    The first dimension corresponds to the number of files, the second to the number of points, and the third to the data columns.
    """
    # Load the files in sorted file-name order, which fixes the order of the first axis.
    input_list = []
    for path in sorted(glob(os.path.join(folder_path, "*.dat"))):
        input_list.append(np.loadtxt(path))
    # Stacking into one array requires every file to have the same number of rows and columns.
    xr_array = np.array(input_list)
    return xr_array

To test this function, you can use the test script in the tests/specific/900_Custom folder as follows.

cd tests/specific/900_Custom
pytest test_010_Custom.py -s -k test_020_load_xr
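The 3D array returned by load_xr can be split into more convenient pieces. The sketch below assumes, as is common for ATSAS-style .dat files but not stated in this chapter, that the three data columns are the scattering vector q, the intensity, and its error. The helper name `split_xr_array`, its return names, and the transposition to a points-by-frames layout are illustrative choices, not the library's API.

```python
import numpy as np


def split_xr_array(xr_array):
    """Split a (files, points, columns) array into q, intensity, and error.

    Hypothetical helper; assumes columns 0, 1, 2 are q, intensity, error.
    """
    if xr_array.ndim != 3 or xr_array.shape[2] < 3:
        raise ValueError("expected an array of shape (files, points, >=3)")
    qv = xr_array[0, :, 0]
    # All files should share the same q grid for a matrix layout to make sense.
    if not np.allclose(xr_array[:, :, 0], qv):
        raise ValueError("the files do not share a common q grid")
    # Transpose so that rows are q points and columns are frames,
    # mirroring the wavelengths-by-frames layout of the UV matrix.
    xrM = xr_array[:, :, 1].T
    xrE = xr_array[:, :, 2].T
    return qv, xrM, xrE


# Demonstration with synthetic data standing in for load_xr output:
# 5 files, 10 q points, 3 columns each.
q_grid = np.linspace(0.01, 0.3, 10)
frames = [
    np.column_stack([q_grid, np.full(10, float(i)), np.full(10, 0.1)])
    for i in range(5)
]
xr_array = np.array(frames)
qv, xrM, xrE = split_xr_array(xr_array)

print(qv.shape, xrM.shape, xrE.shape)
```

If your files carry a different column order or extra columns, only the column indices in this helper need to change.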