MLNanoShaper

Documentation for MLNanoShaper. MLNanoShaper is a machine-learning algorithm that computes the surface of proteins. There are multiple ways to interface with the software:

  • As the Julia modules MLNanoShaper and MLNanoShaperRunner.
  • As a CLI command mlnanoshaper in ~/.julia/bin.
  • As a training script script/training.bash that runs multiple training runs; requires parallel.
  • For inference only: as a .so object.
MLNanoShaper.AccumulatorLoggerType
accumulator(processing,logger)

A processing logger that transforms the wrapped logger over multiple batches. Can be used to smooth numerical data, e.g. for logging to TensorBoardLogger.

source
MLNanoShaper.LossTypeType
abstract type LossType end

LossType is an interface for defining loss functions.

Implementation

  • getlossfn(::LossType)::Function : the associated loss function
  • metrictype(::Type{<:LossType})::Type : the type of the metrics returned by the loss function
  • getlosstype(::StaticSymbol)::LossType : the function generating the LossType
source
MLNanoShaper.TrainingDataType

Training information used in model training.

Fields

  • atoms: the set of atoms used as model input
  • skin: the surface generated by NanoShaper
source
MLNanoShaper.TrainingParametersType
TrainingParameters

The training parameters used in model training. Default values are defined in the param file. Training is deterministic: these values are hashed to identify a training run.

source
MLNanoShaper._trainMethod
_train(training_parameters::TrainingParameters, directories::AuxiliaryParameters)

Train the model given TrainingParameters and AuxiliaryParameters.

source
MLNanoShaper._trainMethod
train((train_data,test_data),training_states; nb_epoch)

Train the model on (train_data, test_data) for nb_epoch epochs.

source
MLNanoShaper.categorical_lossMethod
categorical_loss(model, ps, st, (; point, atoms, d_real))

The loss function used in training. Returns the KL divergence between the true probability and the empirical probability. Returns the error with respect to the expected distance as a metric.

source
MLNanoShaper.continus_lossMethod
continus_loss(model, ps, st, (; point, atoms, d_real))

The loss function used in training. Compares the predicted (squared) distance with $\frac{1 + \tanh(d)}{2}$. Returns the error with respect to the expected distance as a metric.

source
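As a hedged illustration (the helper name `target` is hypothetical, not part of the package API), the regression target above maps a signed distance d into the interval (0, 1):

```julia
# Hypothetical helper showing the regression target (1 + tanh(d)) / 2.
# d = 0 (a point on the surface) maps to 0.5; large |d| saturates toward 0 or 1.
target(d) = (1 + tanh(d)) / 2

target(0.0)   # 0.5
target(10.0)  # ≈ 1.0
```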
MLNanoShaper.generate_dataMethod
generate_data()

Generate data from the parameter files in param/ by downloading the PDB files and running NanoShaper.

source
MLNanoShaper.generate_data_pointsMethod
generate_data_points(
    preprocessing::Lux.AbstractExplicitLayer, points::AbstractVector{<:Point3},
    (; atoms, skin)::TreeTrainingData{Float32}, (; ref_distance)::TrainingParameters)

Generate the data points for a set of positions points on one protein.

source
MLNanoShaper.implicit_surfaceMethod
implicit_surface(atoms::AnnotedKDTree{Sphere{T}, :center, Point3{T}},
    model::Lux.StatefulLuxLayer, (;
        cutoff_radius, step)) where {T}

Create a mesh from the isosurface of the function `pos -> model(atoms, pos)` using the marching cubes algorithm with step size `step`.
source
MLNanoShaper.load_data_pdbMethod
load_data_pdb(T, name::String)

Load a TrainingData{T} from the current directory. A .pdb file and an .off file named name must be present in the current directory.

source
MLNanoShaper.load_data_pqrMethod
load_data_pqr(T, name::String)

Load a TrainingData{T} from the current directory. A .pqr file and an .off file named name must be present in the current directory.

source
MLNanoShaperRunner.OptionType
state

The global state manipulated by the C interface. To use it, first load the weights using load_weights and the input atoms using load_atoms. Then call eval_model to get the field at a given point.

source
MLNanoShaperRunner.ConcatenatedBatchType
ConcatenatedBatch

Represents a vector of arrays of sizes (a..., bn), where bn is the variable dimension of each batch. Views of the individual arrays can be accessed with get_slice.

source
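A plain-Julia sketch of this layout (illustrative only; it uses raw views rather than the package's get_slice):

```julia
# Two batches of 2-feature columns stored as one flat (2, 3) matrix:
# batch 1 contributes 2 columns, batch 2 contributes 1.
flat = [1 2 3;
        4 5 6]
lengths = [2, 1]

# Views over the variable last dimension recover the individual arrays,
# which is what get_slice provides on a ConcatenatedBatch.
batch1 = view(flat, :, 1:2)   # [1 2; 4 5]
batch2 = view(flat, :, 3:3)   # the column [3, 6]
```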
MLNanoShaperRunner.batched_sumMethod
batched_sum(b::AbstractMatrix,nb_elements::AbstractVector)

Compute the sum of a ConcatenatedBatch with ndims = 2. The first dimension is the feature dimension; the second is the batch dimension.

Given b of size (n,m) and nb_elements of size (k,), the output has size (n,k).

source
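The semantics above can be sketched in plain Julia (a reference sketch under the stated sizes, not the package's implementation; the name batched_sum_sketch is made up):

```julia
# Reference sketch of batched_sum: sum each batch's columns of the
# concatenated (n, m) matrix b, where m == sum(nb_elements).
function batched_sum_sketch(b::AbstractMatrix, nb_elements::AbstractVector{<:Integer})
    out = similar(b, size(b, 1), length(nb_elements))
    stop = 0
    for (k, len) in enumerate(nb_elements)
        cols = (stop + 1):(stop + len)
        out[:, k] = vec(sum(view(b, :, cols); dims = 2))
        stop += len
    end
    out
end

batched_sum_sketch([1 2 3; 4 5 6], [2, 1])  # [3 3; 9 6]
```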
MLNanoShaperRunner.load_atomsFunction
load_atoms(start::Ptr{CSphere},length::Cint)::Cint

Load the atoms into the Julia model. start is a pointer to the first element of an array of CSphere, and length is the length of that array.

Return an error status:

  • 0: OK
  • 1: data could not be read
  • 2: unknown error
source
MLNanoShaperRunner.load_modelFunction
load_model(path::String)::Cint

Load the model from a MLNanoShaperRunner.SerializedModel serialized state at absolute path path.

Return an error status:

  • 0: OK
  • 1: file not found
  • 2: file could not be deserialized properly
  • 3: unknown error
source