Single Structures¶
The main data structure in BiobOx is the Structure
class, which handles collections of 3D points.
Points are stored in a MxNx3 coordinates array, where M is the number of alternative points arrangements, and N is the amount of points.
At any moment, one of the loaded points conformations is considered to be the active one (a.k.a. current).
Any rototranslation or measuring operation will be performed on the current structure only.
Some methods, e.g. rmsd
allow comparing different conformations, independently from which is the current one.
The current conformation in the coordinates array can be changed by calling the set_current
method (altering the Structure’s current attribute).
For comfort, the current conformation is accessible in the points Nx3 array, where:
>>> self.points = self.coordinates[self.current]
Points attributes are stored in a pandas dataframe. For instance, self.data[“radius”] contains each point’s radius. A property of the whole point cloud can be stored in a properties dictionary (e.g. self.properties[“center”]).
Several Structure
subclasses are available (Molecule
, Ellipsoid
, Cylinder
, Cone
, Sphere
, Prism
, Density
, see below).
Structure¶
- class Structure(p=array([], shape=(2, 0), dtype=float64), r=1.0)[source]¶
A Structure consists of an ensemble of points in 3D space, and metadata associated to each of them.
Point coordinates and properties data structures are first initialized. properties is a dictionary initially containing an entry for ‘center’ (center of geometry) and ‘radius’ (average radius of points).
- Parameters
p – coordinates data structure as a mxnx3 numpy array (alternative conformation x atom x 3D coordinate). nx3 numpy array can be supplied, in case a single conformation is present.
r – average radius of every point in dataset (float), or radius of every point (numpy array)
- add_xyz(coords)[source]¶
add a new alternative conformation to the database
- Parameters
coords – array of 3D points, or array of arrays of 3D points (in case multiple alternative coordinates must be added at the same time)
- align_axes()[source]¶
Align structure on its principal axes.
First principal axis aligned along x, second along y and third along z.
- apply_transformation(M)[source]¶
apply a 3x3 transformation matrix
- Parameters
M – 3x3 transformation matrix (2D numpy array)
- convex_hull()[source]¶
compute Structure’s convex Hull using QuickHull algorithm.
Note
Qhull available only on scipy >=0.12
- Returns
Structure
object, containing the coordinates of vertices composing the convex hull
- coordinates¶
numpy array containing an ensemble of alternative coordinates in 3D space
- current¶
index of currently selected conformation
- data¶
metadata about each atom (pandas Dataframe)
- delete_xyz(index)[source]¶
remove one conformation from the conformations database.
the new current conformation will be the previous one.
- Parameters
index – alternative coordinates set to remove
- get_center()[source]¶
compute protein center of geometry (also assigns it to self.properties[“center”] key).
- get_density(step=1.0, sigma=1.0, kernel_half_width=5, buff=3)[source]¶
generate density map from points
- Parameters
step – size of cubic voxels, in Angstrom
sigma – gaussian kernel sigma
kernel_half_width – kernel half width, in voxels
buff – padding to add at points cloud boundaries
- Returns
Density
object, containing a simulated density map
- get_principal_axes()[source]¶
compute Structure’s principal axes.
- Returns
3x3 numpy array, containing the 3 principal axes ranked from smallest to biggest.
- get_xyz(indices=[])[source]¶
get points coordinates.
- Parameters
indices – indices of points to select. If none is provided, all points coordinates are returned.
- Returns
coordinates of all points indexed by the provided indices list, or all of them if no list is provided.
- pca(components, indices=[])[source]¶
compute Principal Components Analysis (PCA) on specific points within all the alternative coordinates.
- Parameters
components – eigenspace dimensions
indices – points indices to be considered for PCA
- Returns
numpy array of projection of each conformation into the n-dimensional eigenspace
- Returns
sklearn PCA object
- points¶
pointer to currently selected conformation
- properties¶
collection of properties. By default, ‘center’ (geometric center of the Structure) is defined
- rmsd(i, j, points_index=[], full=False)[source]¶
Calculate the RMSD between two structures in alternative coordinates ensemble. uses Kabsch alignement algorithm.
- Parameters
i – index of the first structure
j – index of the second structure
points_index – if set, only specific points will be considered for comparison
full – if True, RMSD an rotation matrx are returned, RMSD only otherwise
- Returns
RMSD of the two structures. If full is True, the rotation matrix is also returned
- rmsd_distance_matrix(points_index=[], flat=False)[source]¶
compute distance matrix between structures (using RMSD as metric).
- Parameters
points_index – if set, only specific points will be considered for comparison
flat – if True, returns flattened distance matrix
- Returns
RMSD distance matrix
- rmsd_one_vs_all(ref_index, points_index=[], align=False)[source]¶
Calculate the RMSD between all structures with respect of a reference structure. uses Kabsch alignement algorithm.
- Parameters
ref_index – index of reference structure in conformations database
points_index – if set, only specific points will be considered for comparison
align – if set to true, all conformations will be aligned to reference (note: cannot be undone!)
- Returns
RMSD of all structures with respect of reference structure (in a numpy array)
- rmsf(indices=- 1, step=1)[source]¶
compute Root Mean Square Fluctuation (RMSF) of selected atoms.
- Parameters
indices – indices of points for which RMSF will be calculated. If no indices list is provided, RMSF of all points will be calculated.
step – timestep between two conformations (useful when using conformations extracted from molecular dynamics)
- Returns
numpy aray with RMSF of all provided indices, in the same order
- rotate(x, y, z)[source]¶
rotate the structure provided angles of rotation around x, y and z axes (in degrees).
This is a rotation with respect of the origin. Make sure that the center of your structure is at the origin, if you don’t want to get a translation as well! rotating an object being not centered requires to first translate the ellipsoid at the origin, rotate it, and bringing it back.
- Parameters
x – rotation around x axis
y – rotation around y axis
z – rotation around z axis
- rotation_matrix(axis, theta)[source]¶
compute matrix needed to rotate the system around an arbitrary axis (using Euler-Rodrigues formula).
- Parameters
axis – 3d vector (numpy array), representing the axis around which to rotate
theta – desired rotation angle
- Returns
3x3 rotation matrix
- set_current(pos)[source]¶
select current frame (place frame pointer at desired position)
- Parameters
pos – number of alternative conformation (starting from 0)
Molecule¶
- class Molecule[source]¶
Bases:
biobox.classes.structure.Structure
Subclass of
Structure
, allows reading, manipulating and analyzing molecular structures.At instantiation, properties associated to every individual atoms are stored in a pandas Dataframe self.data. The columns of the self.data have the following names: atom, index, name, resname, chain, resid, beta, occupancy, atomtype, radius, charge.
self.knowledge contains a knowledge base about atoms and residues properties. Default values are:
‘residue_mass’ property stores the average mass for most common aminoacids (values from Expasy website)
‘atom_vdw’ vdw radius of common atoms
‘atom_mass’ mass of common atoms
The knowledge base can be edited. For instance, to add information about residue “TST” mass in molecule M type: M.knowledge[‘mass_residue’][“TST”]=142.42
- apply_biomatrix()[source]¶
if biomatrix information is provided, generate a new molecule with all symmetry operators applied.
- Returns
new Molecule containing several copies of the current Molecule, arranged according to BIOMT statements contained in pdb, or -1 if no transformation matrix is provided
- apply_symmetry()[source]¶
if symmetry information is provided, generate a new molecule with all symmetry operators applied.
- Returns
new Molecule containing several copies of the current Molecule, arranged according to SMTRY statements contained in pdb, or -1 if no transformation matrix is provided
- atomignore(chain, res, atom, get_index=False, use_resname=False)[source]¶
Select specific atoms that do not match a specific query (chain, residue ID and atom name). Useful to remove from a molecule atoms unwanted for further analysis, alternative conformations, etc…
- Parameters
chain – chain name (accepts * as wildcard). Can also be a list or numpy array of strings.
res – residue ID (accepts * as wildcard). Can also be a list or numpy array of of int.
atom – atom name (accepts * as wildcard). Can also be a list or numpy array of strings.
get_index – if set to True, returns the indices of atoms in self.points array (and self.data)
use_resname – if set to True, consider information in “res” variable as resnames, and not resids
- Returns
coordinates of the selected points not matching the query, if get_index is set to true, their indices in self.points array.
- atomselect(chain, res, atom, get_index=False, use_resname=False)[source]¶
Select specific atoms in the protein providing chain, residue ID and atom name.
- Parameters
chain – selection of a specific chain name (accepts * as wildcard). Can also be a list or numpy array of strings.
res – residue ID of desired atoms (accepts * as wildcard). Can also be a list or numpy array of of int.
atom – name of desired atom (accepts * as wildcard). Can also be a list or numpy array of strings.
get_index – if set to True, returns the indices of selected atoms in self.points array (and self.data)
use_resname – if set to True, consider information in “res” variable as resnames, and not resids
- Returns
coordinates of the selected points and, if get_index is set to true, their indices in self.points array.
- beta_factor_from_rmsf(indices=- 1, step=1)[source]¶
estimate atoms beta factor on the base of their RMSF.
- Parameters
indices – indices of atoms of interest. If not set all atoms will be considered.
step – timestep between two conformations (useful when using conformations extracted from molecular dynamics)
- get_atoms_ccs()[source]¶
return array with atomic CCS radii of every atom in molecule
- Returns
CCS in Angstrom^2
- get_couples(idx, cutoff)[source]¶
given a list of indices, compute the all-vs-all distance and return only couples below a given cutoff distance
useful for the detection of disulfide bridges or linkable sites via cross-linking (approximation, supposing euclidean distances)’
- Parameters
idx – indices of atoms to check.
cutoff – minimal distance to consider a couple as linkable.
- Returns
nx3 numpy array containing, for every valid connection, id of first atom, id of second atom and distance between the two.
- get_data(indices=[], columns=[])[source]¶
Return information about atoms of interest (i.e., slice the data DataFrame)
- Parameters
indices – list of indices, if not provided all atom data is returned
columns – list of columns (e.g. [“resname”, “resid”, “chain”]), if not provided all columns are returned
- Returns
numpy array containing a slice of molecule’s data
- get_dipole_density(dipole_map, orig, min_val, V, outname, vox_in_window=3.0, eqn='gauss', T=310.15, P=101000.0, epsilonE=54.0, resolution=1.0)[source]¶
Method to generate an electron density map based on a voxel grid of dipole vectors
- Parameters
dipole_map – The dipole map input. Can be generated with get_dipole_map above
orig – Origin points for voxel grid
min_val – Minimum coordinates of edge points for the voxel grid (i.e. a single x, y, z point defining the start point of the grid to match with the multipdb)
V – Volume of a voxel (can be found by resolution**3, but left blank in case later version institute a sphere)
outname – Name of electron density map file produced
vox_in_window – Amount of surrounding space to contribute to local dipole. vox_in_window * resolution gives window size (in Ang.)
eqn – Equation mode to model the electron density
T – Temperature of MD
P – Pressure of MD
epsilonE – Continuum dielectric surrounding the protein
resolution – Desired resolution of voxel
- get_dipole_map(orig, pqr, time_start=0, time_end=2, resolution=1.0, vox_in_window=3.0, write_dipole_map=True, fname='dipole_map.tcl')[source]¶
Method for generating dipole maps to be used for electron density map generation. Also prints a dipole map as a result (and if desired). It calls a cython code in lib.
- Parameters
orig – Origin points for voxel grid
pqr – PQR file for self. Can be generated by calling pdb2pqr above
time_start – First frame to parse in multipdb
time_end – Last frame to parse in multipdb
resolution – Desired resolution of voxel
vox_in_window – Amount of surrounding space to contribute to local dipole. vox_in_window * resolution gives window size (in Ang.)
write_dipole_map – Write a dipole map in TCL format to be read in via VMD.
fname – Name of desired dipole map to be written
- get_electrostatics(step=1.0, buff=3, threshold=0.01, vdw_kernel_half_width=5, elect_kernel_half_width=12, chain='*', clear_mass=True)[source]¶
generate electrostatics map from points
- Parameters
step – size of cubic voxels, in Angstrom
buff – padding to add at points cloud boundaries
threshold – Threshold used for removing mass occupied space from the electron density map
vdw_kernel_half_width – kernel half width, in voxels
elect_kernel_half_width – kernel half width, in voxels
chain – select chain to use, default all chains
- Returns
positive
Density
object- Returns
negative
Density
object- Returns
mass density object
- get_mass_by_atom(skip_resname=[])[source]¶
compute protein mass using atoms in pdb
sum the mass of all atoms (using a knowledge base of atom masses in Dalton) The knowledge base can be expanded or edited by adding or editing entries to the molecule’s mass dictionary, e.g. to add the atom “PI” mass in molecule M type: M.knowledge[‘atom_mass’][“PI”]=3.141592
- Parameters
skip_resname – list of resnames to skip. Useful to exclude ions water or other ligands from the calculation.
- Returns
mass of molecule in Dalton
- get_mass_by_residue(skip_resname=[])[source]¶
Compute protein mass using residues (i.e. account also for atoms not present in the structure)
Sum the average mass all every residue (using a knowledge base of atom masses in Dalton) The knowledge base can be expanded or edited by adding entries to the molecule’s mass dictionary, e.g. to add the residue “TST” mass in molecule M type: M.knowledge[‘mass_residue’][“TST”]=142.42
- Parameters
skip_resname – list of resnames to skip. Useful to exclude ions water or other ligands from the calculation.
- Returns
mass of molecule in Dalton
- get_pdb_data(index=[])[source]¶
aggregate data and point coordinates, and return in a unique data structure
Returned data is a list containing strings for points data and floats for point coordinates in the same order as a pdb file, i.e. ATOM/HETATM, index, name, resname, chain name, residue ID, x, y, z, occupancy, beta factor, atomtype.
- Returns
list aggregated data and coordinates for every point, as string.
- get_secondary_structure(dssp_path='')[source]¶
compute the protein’s secondary structure, calling DSSP :param dssp_path: DSSP executable (path and filename). If not provided, the default behaviour is to seek for this information in the environment variable DSSPPATH :returns: numpy array of characters, with one-letter-coded secondary sctructure according to DSSP.
- get_subset(idxs, conformations=[])[source]¶
Return a
Molecule
object containing only the selected atoms and frames- Parameters
ixds – atoms to extract
conformations – frames to extract (by default, all)
- Returns
Molecule
object
- get_vdw_density(buff=3, step=0.5, kernel_half_width=10)[source]¶
generate density map from points based on the van der Waals readius of the atoms
- Parameters
buff – Buffer used to create the boundaries of the density map
step – Stepsize for creating the density object
kernel_half_width – Kernel half width of the gaussian kernel, will be scaled by atom specific sigma
- Returns
Density
object, containing a density map
- guess_chain_split(distance=3, use_backbone=True)[source]¶
reassign chain name, using distance cutoff (cannot be undone). If two consecutive atoms are beyond a cutoff, a new chain is assigned.
- Parameters
distance – distance cutoff distanceR: no atomtype found!
use_backbone – if True, splitting will be performed considering backbone atoms (N and C), all atoms in a sequence otherwise
- import_gro(filename)[source]¶
read a gro possibly containing multiple structures.
- Parameters
filename – name of .gro file to import
- import_pdb(pdb, include_hetatm=False)[source]¶
read a pdb (possibly containing containing multiple models).
Models are split according to ENDMDL and END statement. All alternative coordinates are expected to have the same atoms. After loading, the first model (M.current_model=0) will be set as active.
- Parameters
pdb – PDB filename
include_hetatm – if True, HETATM will be included (they get skipped if False)
- import_pqr(pqr, include_hetatm=False)[source]¶
Read a pqr (possibly containing containing multiple models).
models are split according to ENDMDL and END statement. All alternative coordinates are expected to have the same atoms. After loading, the first model (M.current_model=0) will be set as active.
- Parameters
pqr – PQR filename
include_hetatm – if True, HETATM will be included (they get skipped if False)
- know(prop)[source]¶
return information from knowledge base
- Parameters
prop – desired property to extract from knowledge base
- Returns
value associated to requested property, or nan if failed
- match_residue(M2, sec=5)[source]¶
Compares two bb.Molecule() peptide strands and returns the resids within both peptides when the two are homogenous beyond a certain secondary structure threashold. The default is 5 amino acids (given by sec) in a row must be identical
Useful when aligning PDB structures that have been crystallised separately - so one may be missing the odd residue or have a few extra at the end.
- Parameters
M2 – The second bb.Molecule() to compare with
sec – Number of consecutive amino acids in a row that must match before resid’s are recorded
- pdb2pqr(ff='')[source]¶
Parses data from the pdb input into a pqr format. This uses the panda dataframe with the information regarding atom indexes, types etc. in the self.data files. It outputs a panda dataframe with the pqr equivilent information. It requires a datafile forcefield input. The default is the amber14sb forcefield file held within the classes/ folder.
- Parameters
ff – name of forcefield text file input that needs to be read to read charges / vdw radii.
- query(query_text, get_index=False)[source]¶
Select specific atoms in a multimer un the basis of a text query.
- Parameters
query_text – string selecting atoms of interest. Uses the pandas query syntax, can access all columns in the dataframe self.data.
get_index – if set to True, returns the indices of selected atoms in self.points array (and self.data)
- Returns
coordinates of the selected points (in a unique array) and, if get_index is set to true, a list of their indices in subunits’ self.points array.
- rmsf_from_beta_factor(indices=[])[source]¶
calculate RMSF from atoms beta factors.
- Parameters
indices – indices of atoms of interest. If not set all atoms will be considered.
- s2(atomname1='N', atomname2='H')[source]¶
compute s2, given two atoms defining the vector of interest.
- Parameters
atomname1 – name of the first atom
atomname2 – name of the second atom
- Returns
data numpy array containing information about residues for which measuring has been performed (i.e.[chain, resid])
- Returns
s2 s2 of residues for which both provided input atoms have been found
- same_residue(index, get_index=False)[source]¶
Select atoms having the same residue and chain as a given atom (or list of atoms)
- Parameters
indices (index) – of atoms of choice (integer of list of integers)
get_index – if set to True, returns the indices of selected atoms in self.points array (and self.data)
- Returns
coordinates of the selected points and, if get_index is set to true, their indices in self.points array.
- same_residue_unique(index, get_index=False)[source]¶
Select atoms having the same residue and chain as a given atom (or list of atoms)
- Parameters
index – indices of atoms of choice (integer of list of integers)
get_index – if set to True, returns the indices of selected atoms in self.points array (and self.data)
- Returns
coordinates of the selected points and, if get_index is set to true, their indices in self.points array.
- set_data(value, indices=[], columns=[])[source]¶
Return information about atoms of interest (i.e., slice the data DataFrame)
- Parameters
indices – list of indices, if not provided all atom data is returned
columns – list of columns (e.g. [“resname”, “resid”, “chain”]), if not provided all columns are returned
- Returns
numpy array containing a slice of molecule’s data
- write_gro(outname, conformations=[], index='')[source]¶
write structure(s) in .gro format.
- Parameters
outname – name of .gro file to be generated.
index – indices of atoms to write to file. If empty, all atoms are returned. Index values obtaineable with a call like: index=molecule.atomselect(“A”, [1, 2, 3], “CA”, True)[1]
conformations – list of conformation indices to write to file. By default, all conformations will be returned.
- write_pdb(outname, conformations=[], index=[])[source]¶
overload superclass method for writing (multi)pdb.
- Parameters
outname – name of pdb file to be generated.
index – indices of atoms to write to file. If empty, all atoms are returned. Index values obtaineable with a call like: index=molecule.atomselect(“A”, [1, 2, 3], “CA”, True)[1]
conformations – list of conformation indices to write to file. By default, a multipdb with all conformations will be produced.
- write_pqr(outname, conformations=[], index=[])[source]¶
overload superclass method for writing (multi)pqr.
- Parameters
outname – name of pqr file to be generated.
index – indices of atoms to write to file. If empty, all atoms are returned. Index values obtaineable with a call like: index=molecule.atomselect(“A”, [1, 2, 3], “CA”, True)[1]
conformations – list of conformation indices to write to file. By default, a multipdb with all conformations will be produced.
Convex Point Clouds¶
The following classes allow generating clouds of points arranged according to specific (convex) geometries.
All these classes are subclass of Structure
.
- class Cone(r, h, skew=0, radius=1.1, pts_density_r=0.09817477042468103, pts_density_h=0.2)[source]¶
Bases:
biobox.classes.structure.Structure
Create an ensemble of points arranged as a cone.
- Parameters
r – radius
h – height
skew – skewing with respect of vertical axis
radius – size of the individual points composing it
pts_density_r – density of points along the rotation axis (using parametric function for cylinder)
pts_density_h – density of points along height (using parametric function for cylinder)
- class Cylinder(r, h, squeeze=1.0, skew=0.0, radius=1.1, pts_density_u=0.09817477042468103, pts_density_h=0.2)[source]¶
Bases:
biobox.classes.structure.Structure
Create an ensemble of points arranged as an elliptical cylinder.
- Parameters
r – radius
h – height
squeeze – create an elliptical base, having axes equal to r and squeeze*r
skew – skewing with respect of vertical axis
radius – size of the individual points composing it
density (pts_density_h) – of points along the u angle (using parametric function for cylinder)
density – of points along the v angle (using parametric function for cylinder)
- ccs(gas=1)[source]¶
Compute cylinder CCS.
Uses Ramanujan approximation for base perimeter. Good, but not perfect for very elliptical cylinders!
- Returns
CCS in A^2
- class Ellipsoid(a, b, c, radius=1.9, pts_density_u=0.08726646259971647, pts_density_v=0.08726646259971647)[source]¶
Bases:
biobox.classes.structure.Structure
Create an ensemble of points arranged as an ellipsoid.
- Parameters
a – x radius of the ellipsoid
b – y radius of the ellipsoid
c – z radius of the ellipsoid
radius – size of the individual points composing it
pts_density_u – This parameter defines the density of points along the u angle (using parametric function for ellipsoid)
pts_density_v – This parameter defines the density of points along the v angle (using parametric function for ellipsoid)
- ccs(gas=1)[source]¶
compute ellipsoid CCS.
Uses analytical approximation to surface area.
- Returns
surface in A^2
- check_inclusion(p)[source]¶
count how many points in the array p are included in the ellipsoid.
overloading of superclass function, which is slower (here we can use the ellipsoid functional form to speed up things)
- Parameters
p – list of points (numpy array)
- Returns
quantity of points located inside the ellipsoid
- class Prism(r, h, n, skew=0.0, radius=1.1, pts_density_u=0.09817477042468103, pts_density_h=0.2)[source]¶
Bases:
biobox.classes.structure.Structure
Create an ensemble of points arranged as a prism (polygonal bottom and top, rectangular sides).
- Parameters
r – distance of sides from center of symmetry
h – height
n – number of side faces
skew – skewing with respect of vertical axis
radius – size of the individual points composing it
pts_density_u – density of points along the r axis (polar coordinates)
pts_density_h – the density of points along the vertical axis
- class Sphere(r, radius=1.9, n_sphere_point=960)[source]¶
Bases:
biobox.classes.structure.Structure
Create an ensemble of points arranged as a sphere.
using golden spiral to approximate an even distribution
- Parameters
r – radius of the ellipsoid
radius – size of the individual points composing it
n_sphere_point – This parameter defines the amount of points in the sphere
- ccs(gas=1)[source]¶
compute spheroid CCS.
Uses analytical approximation to surface area.
- Returns
surface in A^2
- check_inclusion(p)[source]¶
count how many points in the array p are included in the sphere.
overloading of superclass function, which is slower (here we can use the ellipsoid functional form to speed up things)
- Parameters
p – list of points (numpy array)
- Returns
quantity of points located inside the sphere
- get_sphericity()[source]¶
compute sphericity (makes sense only for squeezed spheres, obviously..)
- Returns
shape sphericity
- squeeze(deformation, preserve_volume=True)[source]¶
squeeze sphere according to deformation coefficient(s)
- Parameters
deformation – deformation coefficient. Can be a float (deformation of one axis), or a list of 2 or 3 floats.
preserve_volume – If true and deformation is either a float or a list of 2 floats, correct remaining axes to preserve volume.
Density¶
- class Density[source]¶
Bases:
biobox.classes.structure.Structure
Subclass of
Structure
, allows importing density map, and transform them in a PDB file containing a collection of spheres placed on the map’s high density regions.A density map is fully described by the following attributes, stored in the self.properties dictionary:
- Parameters
density – density map
delta – scaling factor for voxels (default is [1, 1, 1] Angstrom)
size – dimensions in voxels
origin – bottom-left-front corner of the cube
radius – radius of points composing the density map
format – format name (only dx supported at the moment)
- best_threshold(mass, density=0.782878356)[source]¶
If mass and density of object are known, try to filter the map so that the mass is best matched.
search for best threshold using bissection method.
Note
in proteins, an average value of 1.3 g/cm^3 (0.782878356 Da/A^3) can be assumed. Alternatively, the relation density=1.410+0.145*exp(-mass(kDa)/13) can be used.
Note
1 Da/A^3=1.660538946 g/cm^3
Note
1 g/cm^3=0.602214120 Da/A^3
- Parameters
mass – target mass in Da
density – target density in Da/A^3
- Returns
array reporting tested values and error on mass ([sigma, model_mass-target_mass])
- blur(dimension=5, sigma=0.5)[source]¶
blur density applying a cubic gaussian kernel of given kernel dimension (cubic grid size).
Warning
cannot be undone
- Parameters
dimension – size of the kernel grid.
sigma – standard deviation of gaussian kernel.
- export_as_pdb(outname, step, threshold=0.1)[source]¶
Write a pdb file with points where the density exceeds a threshold
- Parameters
outname – output file name
step – stepsize used to generate the density map
threshold – density to be exceeded to generate a point in pdb
- find_data_from_ccs(ccs)[source]¶
map experimental data to given ccs
- Parameters
ccs – target CCS (in A^2)
- find_data_from_sigma(sigma, exact=True, append=False, noise_filter=0.01)[source]¶
map experimental data to given threshold
- Parameters
sigma – density threshold
noise_filter – launch DBSCAN clustering algorithm to detect connected regions in density map. Regions representing less than noise_filter of the total will be removed. This is a ratio, value should be between 0 and 1.
- get_oversampled_points(sigma=0)[source]¶
return points obtained by oversampling the map (doule points on every axis)
- Parameters
sigma – place points only on voxels having intensity greater than threshold
- Returns
points 3D points placed on voxels having value higher than threshold
- Returns
radius radius of produced points
- get_sigma_from_thresh(t)[source]¶
convert cutoff value from actual threshold to sigma multiple
:param threshold value :returns sigma multiple
- get_thresh_from_sigma(val)[source]¶
convert cutoff value from sigma multiples into actual threshold
- Parameters
val – sigma scaling
- Returns
cutoff
- get_volume()[source]¶
compute density map volume. This is done by counting the points, and multiplying that by voxels’ volume.
Warning
can be called only after
place_points
has been called.Warning
supposes unskewed voxels.
- import_map(filename, fileformat='dx')[source]¶
Import density map and fill up the points and properties data structures.
- Parameters
filename – name of density file to load
fileformat – at the moment supports dx, ccp4, mrc and imod
- import_numpy(data, origin=[0, 0, 0], delta=array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]]))[source]¶
import a numpy 3D array to allow manipulation as a density map
- Parameters
data – numpy 3D array
origin – coordinates of bottom left corner of the map
delta – voxels’ shape (default is a cubic voxel of 1 Angstrom-long sides).
- place_points(sigma=0, noise_filter=0.01)[source]¶
given density information, place points on every voxel above a given threshold.
- Parameters
sigma – intensity threshold value.
noise_filter – launch DBSCAN clustering algorithm to detect connected regions in density map. Regions representing less than noise_filter of the total will be removed. This is a ratio, value should be between 0 and 1.
- scan_threshold(mass, density=0.782878356, sampling_points=1000)[source]¶
if mass and density of object are known, filter the map on a linear scale of threshold values, and compare the obtained mass to the experimental one.
Note
in proteins, an average value of 1.3 g/cm^3 (0.782878356 Da/A^3) can be assumed. Alternatively, the relation density=1.410+0.145*exp(-mass(kDa)/13) can be used.
Note
1 Da/A^3=0.602214120 g/cm^3
- Parameters
mass – target mass in Da
density – target density in Da/A^3
sampling_points – number of measures to perform between min and max intensity in density map
- Returns
array reporting tested values and error on mass ([threshold, model_mass-target_mass])
- threshold_vol_ccs(low='', high='', sampling_points=1000, append=False, noise_filter=0.01)[source]¶
return the volume to threshold to CCS relationship
- Parameters
sampling_points – number of measures to perform between min and max intensity in density map
- Returns
array reporting tested values and error on mass ([threshold, model_mass-target_mass])