Workflows
A pseudo-random distribution of immunogold particles is generated automatically as a reference using Python's random.randint, which returns a pseudo-random integer within an inclusive range. Python's random module is built on the widely used deterministic "Mersenne Twister" generator, whose core function produces floats in the range [0.0, 1.0). The program accepts the boundaries of a selected image mask and creates pseudo-random integer coordinates within that range, so the particles stay inside the area where coordinates are expected. By default, the implementation generates as many coordinates as there are gold particles in the CSV file; a fixed count of N coordinates can be set instead on the main page for specific comparisons.
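The generation step can be sketched as rejection sampling inside the mask. This is a minimal illustration, not GIO's actual code: the function name `random_particles`, the list-of-rows mask representation, and the `seed` parameter are all assumptions made for the example.

```python
import random

def random_particles(mask, n, seed=None):
    """Return n pseudo-random (x, y) integer coordinates that fall inside mask.

    mask is a list of rows; a truthy cell marks a pixel inside the region.
    (Hypothetical representation, chosen only for this sketch.)
    """
    rng = random.Random(seed)  # Mersenne Twister generator
    height, width = len(mask), len(mask[0])
    if not any(any(row) for row in mask):
        raise ValueError("mask contains no selectable pixels")
    points = []
    while len(points) < n:
        x = rng.randint(0, width - 1)   # randint is inclusive on both ends
        y = rng.randint(0, height - 1)
        if mask[y][x]:                  # keep only points inside the mask
            points.append((x, y))
    return points
```

Passing a fixed `seed` makes the control distribution reproducible between runs, which can help when comparing analyses.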
The nearest neighbor distance (NND) for all gold particles is calculated in GIO and automatically compared to the randomly generated control distribution. It is calculated by iterating through a list of coordinates, where each coordinate is an XY pair representing the location of a gold particle within the replica images. For each coordinate pair, GIO iterates through the list to find the closest particle using the Euclidean distance formula and outputs the smallest value. GIO then outputs a CSV listing the NND for all gold particles, a histogram, and a visual output of the NND as gradient-color-coded lines between particles overlaid on the original replica image. The histogram bin size can be modified by the user.
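The brute-force search described above can be sketched as follows, using the Euclidean distance d = sqrt((x2 - x1)^2 + (y2 - y1)^2). The function name `nearest_neighbor_distances` is hypothetical, and the sketch returns only the distances, not the CSV, histogram, or overlay.

```python
import math

def nearest_neighbor_distances(points):
    """Return, for each (x, y) point, the distance to its closest other point."""
    distances = []
    for i, (x1, y1) in enumerate(points):
        best = math.inf
        for j, (x2, y2) in enumerate(points):
            if i == j:
                continue  # a particle is not its own neighbor
            best = min(best, math.hypot(x2 - x1, y2 - y1))
        distances.append(best)
    return distances
```

The nested loop is quadratic in the number of particles, which is simple to follow but may be slow for very large coordinate lists.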
Cluster analysis of gold particles is performed with the Ward hierarchical method, which minimizes the variance of the clusters being merged (Ward Jr., 1963). GIO finds cluster states of the gold particles and sorts them into groups using a form of agglomerative clustering from the scikit-learn Python package (Pedregosa et al., 2011). Euclidean distance is used as the affinity, or metric. GIO can calculate clusters in two ways: 1) placing a pre-set number of clusters in the region of interest, and 2) creating clusters with a given distance threshold. By default, and in most cases, the distance threshold method is used for replica image analysis. GIO creates a circle of set radius centered on each gold particle, then sorts particles into groups (i.e. clusters) based on whether their circles overlap. Clusters are then assigned identification numbers, and the area of each cluster is calculated as the union of all overlapping circles. The result, summarized in a CSV file, includes the cluster identification number, the number of gold particles in the cluster, and the cluster area. The color overlay of the clusters can be displayed on the replica image by setting “draw clust area” to “true”.
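The circle-overlap grouping can be illustrated with a small union-find sketch. This is not the Ward linkage from scikit-learn that GIO uses; it only mirrors the description of overlapping circles. It assumes that two circles of equal radius overlap when their centers are at most twice the radius apart. The name `cluster_by_overlap` is hypothetical, and the cluster area is not computed.

```python
import math

def cluster_by_overlap(points, radius):
    """Label points so that particles with overlapping circles share a cluster ID."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            (x1, y1), (x2, y2) = points[i], points[j]
            # Assumption: circles overlap when centers are within 2 * radius.
            if math.hypot(x2 - x1, y2 - y1) <= 2 * radius:
                parent[find(i)] = find(j)

    # Relabel roots as consecutive cluster IDs in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(len(points))]
```

Because overlaps chain together, a long string of particles ends up in one cluster here, which may differ from the groups a variance-minimizing method would produce.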
GIO can measure the separation of clusters, i.e., the distance from an individual cluster to its nearest neighboring cluster (Altof et al., 2015). It determines the centroid of each cluster by averaging the X and Y values of all points in the cluster. For this analysis, clusters are defined as containing more than three gold particles; this criterion can be modified on the menu page. GIO then performs the same NND calculation described in section 2.4.2 on the centroids of all clusters that meet the set criteria.
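A minimal sketch of the centroid and separation steps, assuming each particle already carries a cluster label. The function names and the `min_size` parameter are hypothetical; `min_size=4` encodes the "more than three gold particles" default.

```python
import math
from collections import defaultdict

def cluster_centroids(points, labels, min_size=4):
    """Return {label: (mean_x, mean_y)} for clusters with at least min_size points."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    centroids = {}
    for label, members in groups.items():
        if len(members) >= min_size:
            cx = sum(x for x, _ in members) / len(members)
            cy = sum(y for _, y in members) / len(members)
            centroids[label] = (cx, cy)
    return centroids

def cluster_separation(centroids):
    """Return {label: distance to the nearest other cluster centroid}."""
    result = {}
    for a, (ax, ay) in centroids.items():
        others = [math.hypot(bx - ax, by - ay)
                  for b, (bx, by) in centroids.items() if b != a]
        if others:  # a lone cluster has no neighbor to measure against
            result[a] = min(others)
    return result
```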
Gold Rippler is an analysis method unique to the package that measures the distance of gold particles relative to their closest “landmark”, based on coordinates uploaded from a second CSV file. The result shows whether the gold particles tend to lie closer to or farther from the landmarks. The “ripple” grows from the centroid of each coordinate pair in a given lighthouse population by a user-defined step size (60 px by default) for a set number of steps (10 by default), until the ripple covers X gold particles. GIO then calculates the “Landmark Correlated Particle Index” (LCPI), which is the percentage of gold particles within the mask divided by the percentage of the total masked area of interest: LCPI = (% of gold particles within landmark masks) / (% of total area of ROI mask taken up by spine masks). When gold particles are evenly distributed, the LCPI is 1, because the fraction of gold particles inside the ripples should equal the fraction of the area covered by the ripples.
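The LCPI ratio for a single ripple radius can be sketched by counting pixels. This is an illustration under stated assumptions, not GIO's implementation: the mask is a list of rows of truthy and falsy pixels, the ripple is the union of discs of the given radius around the landmarks, and the name `lcpi` is hypothetical.

```python
import math

def lcpi(particles, landmarks, mask, radius):
    """Fraction of ROI particles inside the ripples divided by the
    fraction of ROI area the ripples cover."""
    def in_ripple(x, y):
        return any(math.hypot(x - lx, y - ly) <= radius for lx, ly in landmarks)

    roi_pixels = 0
    ripple_pixels = 0
    for y, row in enumerate(mask):
        for x, inside in enumerate(row):
            if inside:
                roi_pixels += 1
                if in_ripple(x, y):
                    ripple_pixels += 1

    roi_particles = [(x, y) for x, y in particles if mask[y][x]]
    if not roi_particles or ripple_pixels == 0:
        raise ValueError("no particles in the ROI or ripples cover no ROI pixels")

    particle_fraction = sum(in_ripple(x, y) for x, y in roi_particles) / len(roi_particles)
    area_fraction = ripple_pixels / roi_pixels
    return particle_fraction / area_fraction
```

Calling the function once per step, for example with radii of 60, 120, and so on up to 600 px, would give one index per ripple size under the defaults mentioned above.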
Starfish nearest neighbor distance (NND) is the nearest neighbor distance between the primary dataset and a separate lighthouse population. We call it Starfish because of the distinctive shape of the generated output. TBD. Starfish NND is found by iterating through a list of coordinate pairs and, for each pair, finding the closest particle in the lighthouse population by the smallest Euclidean distance. Starfish supports conclusions that compare different populations using quantitative distances rather than a more general correlation. However, Starfish may not always generate output that is biologically sound.
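The cross-population search can be sketched as below. The function name `starfish_nnd` and the returned tuple layout are assumptions for illustration only.

```python
import math

def starfish_nnd(particles, landmarks):
    """For each particle, return (particle, nearest landmark, distance)."""
    results = []
    for px, py in particles:
        # Pick the landmark with the smallest straight-line distance.
        nearest = min(landmarks, key=lambda lm: math.hypot(lm[0] - px, lm[1] - py))
        distance = math.hypot(nearest[0] - px, nearest[1] - py)
        results.append(((px, py), nearest, distance))
    return results
```

The distance is a straight line, so it ignores the mask; a line may cross regions outside the structure of interest, which is one way the output can fail to be biologically sound.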
A* is an analysis method unique to the package that uses the imported mask to find a biologically plausible path from a particle to a landmark. The A* search algorithm uses a variable "g" to represent the distance from the start node to the current node, and "h" to represent the estimated distance from the current node to the end node. By adding g and h, the algorithm assigns weights to the grid of the provided map and uses these weights to find the shortest path between two points. To keep the long runtimes of this workflow manageable, the A* workflow runs Goldstar first and uses the resulting distances to build a queue ordered by estimated shortest distance, and it also downscales the map. Because many landmark locations lie inside holes in the mask that A* cannot navigate, a workaround was developed. The algorithm first tries to navigate from the landmark inside the hole toward the particle; once it reaches the edge of the hole, it saves that location as a point and stops the current pathfinding. A second A* iteration is then run on the regular mask from the gold particle to that point. Once the two paths meet at that point, they are combined into a single line from the particle to the landmark.
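The core g + h search can be sketched on a small grid. This is a generic A* under stated assumptions, not GIO's workflow: movement is 4-connected with unit step cost, h is the Manhattan distance, the mask is a list of rows where truthy pixels are walkable, and the start is assumed to be walkable. The Goldstar ordering, downscaling, and the two-stage hole workaround are not included.

```python
import heapq

def astar(mask, start, goal):
    """Return the shortest path from start to goal as a list of (x, y), or None."""
    def h(node):
        # Manhattan distance never overestimates on a 4-connected grid.
        return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

    height, width = len(mask), len(mask[0])
    g = {start: 0}                      # cost from start to each reached node
    came_from = {}
    open_heap = [(h(start), start)]     # entries are (g + h, node)
    closed = set()
    while open_heap:
        _, current = heapq.heappop(open_heap)
        if current == goal:
            path = [current]
            while current in came_from:
                current = came_from[current]
                path.append(current)
            return path[::-1]
        if current in closed:
            continue                    # stale heap entry
        closed.add(current)
        x, y = current
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < width and 0 <= ny < height and mask[ny][nx]:
                tentative = g[current] + 1
                if tentative < g.get((nx, ny), float("inf")):
                    g[(nx, ny)] = tentative
                    came_from[(nx, ny)] = current
                    heapq.heappush(open_heap, (tentative + h((nx, ny)), (nx, ny)))
    return None                         # goal is unreachable within the mask
```

A `None` result corresponds to the situation described above, where a landmark sits in a region the search cannot reach through the mask.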