This tutorial guides you through creating a model to hypothesize the potential host range of the marine monogenean parasite Microcotyle sebastis within the Scorpaenidae. M. sebastis has already been reported from other members of the family, namely Scorpaena neglecta, Scymnodalatias albicauda and Scymnodalatias sherwoodi; it is therefore interesting to test which other members of the Scorpaenidae could potentially host this parasite. Before you begin, click here to open the PaNic model setup page in another tab.
1 - Setting up a simple model
The first step is to choose Microcotyle sebastis as the parasite of interest, either by typing the scientific name in the search textbox or by looking up the species in the drop-down menus (screenshot). If you misspell the species name (or search for a species that is not in the internal database), you can search by genus instead. If the species is still not found in the database, you can build a custom model by entering a list of known hosts for the parasite. Microcotyle sebastis is included in the database and can be selected from the Monogenea menu.
Once you have selected the parasite, leave all the model options at their default values, scroll down to the section named "...Define the projection host set", search for Scorpaenidae in the family list and check the corresponding label (screenshot). From here on, we will refer to the projection host set simply as "PHS". Before running the model, also check the option "Show graphs" in the output preferences. Finally, click the "Create your model" button and wait a few seconds for the model to be computed (screenshot).
2 - A closer look at model results
The output page shows the results of the model computed using data from 40 known hosts for Microcotyle sebastis (which is quite a large number for a monogenean).
The known host set (KHS) appears highly homogeneous phylogenetically, with:
- 0 subspecies belonging to the same species (S);
- 36 species sharing a genus with other members of the KHS (G);
- 39 species sharing a family with other members of the KHS (F).
The phylogenetic proximity, PP, is computed as (S+G+F)/(3*N), where N is the number of known hosts (40). Thus, PP = (0+36+39)/(3*40) = 0.63. In contrast, the known host set has a heterogeneous geographical distribution. The total number of locality records for the KHS (Lt) is 229, while the number of localities where at least one member of the KHS occurs (L) is 90. The biogeographical range (BR) for the known host set is equal to L/Lt. Thus, BR = 90/229 = 0.39 (screenshot).
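The two indices can be reproduced with a few lines of Python. This is only a minimal sketch of the arithmetic shown above, not PaNic's own code; the function and argument names are hypothetical, and the inputs are the values used in the worked calculation (S=0, G=36, F=39, N=40 for PP; 90 and 229 for BR).

```python
def phylogenetic_proximity(s, g, f, n_hosts):
    """PP = (S + G + F) / (3 * N), with N the number of known hosts."""
    return (s + g + f) / (3 * n_hosts)


def biogeographical_range(localities, locality_records):
    """BR = L / Lt, i.e. distinct localities over total locality records."""
    return localities / locality_records


# Values taken from the worked arithmetic in the tutorial.
pp = phylogenetic_proximity(s=0, g=36, f=39, n_hosts=40)
br = biogeographical_range(localities=90, locality_records=229)

print(f"PP = {pp:.3f}")  # 0.625, reported as 0.63 in the tutorial
print(f"BR = {br:.2f}")  # 0.39
```

Note the parentheses around 3 * N: without them, (S+G+F)/3*N would multiply by N instead of dividing by it.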
Among the members of the KHS, PaNic suggests only Sebastes aleutianus as a potential outlier, excluding it from model computation (screenshot). PaNic flags a host as an outlier if its average Mahalanobis distance to the other members of the KHS is more than (1+(10-NR)*0.1) times the average Mahalanobis distance for the whole KHS. NR is the Niche Restriction value, which was left at its default of 0, so that (1+(10-NR)*0.1)=2. The average Mahalanobis distance for the KHS was 2.8. S. aleutianus is considered an outlier because its average Mahalanobis distance (6.5) is larger than 2.8*2 = 5.6. Average Mahalanobis distances for the KHS and for the individual hosts are reported in the .txt output file, which can be downloaded by clicking the "Save" button at the bottom of the output page.
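The outlier rule can be written down as a short Python sketch. It assumes the average Mahalanobis distances are already available (for example, read from the .txt output file) and does not compute them; the function names are hypothetical and are not part of PaNic.

```python
def outlier_multiplier(niche_restriction):
    """Multiplier applied to the KHS average distance: 1 + (10 - NR) * 0.1."""
    return 1 + (10 - niche_restriction) * 0.1


def is_outlier(host_mean_distance, khs_mean_distance, niche_restriction=0):
    """True if the host's average distance exceeds multiplier * KHS average."""
    limit = outlier_multiplier(niche_restriction) * khs_mean_distance
    return host_mean_distance > limit


# Values reported in the tutorial for S. aleutianus with NR = 0.
print(is_outlier(6.5, 2.8))  # 6.5 > 2 * 2.8, so the host is flagged
```

With a higher Niche Restriction value the multiplier shrinks (for NR = 5 it is 1.5), so more hosts exceed the limit and are treated as outliers.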
S. aleutianus differs most from the other members of the KHS in its life span and age at first maturity. The values in brackets indicate that the life span and age at first maturity of S. aleutianus are, respectively, more than 6.8 and 5.9 times the niche interval computed for M. sebastis.
No ecological variable has been excluded from computation, meaning that the KHS provides enough data to compute niche boundaries for all the selected ecological parameters. Considering the relative variation of ecological variables (rVEV, i.e. the ratio between the standard deviation of a variable in the KHS and in the PHS), K and trophic level are the most consistent parameters, i.e. their values are much less variable in the KHS than they are in the PHS. By contrast, maximum length, life span and age at first maturity are much more variable in the KHS than they are in the PHS (screenshot).
The computed niche boundaries for the known hosts of M. sebastis are:
|Growth Rate (K)||0.05||0.10||0.32||0.84|
|Age at First Maturity||0.1||2.72||6.43||13.0|
These data are provided by PaNic in the output text file, which can be downloaded by clicking the "Save" button.
Among the members of the PHS (186 species, screenshot), the current model suggests 30 species as probable hosts for M. sebastis (screenshot). For example, Pterois antennata is considered a likely compatible host (80%), because all of its ecological parameters except maximum length fall inside the optimal niche interval computed for M. sebastis.
To see this graphically, scroll down the page and look for the P. antennata graph (screenshot). The light green portion of each stripe represents the relative position of the optimal niche interval for a particular variable computed for M. sebastis with respect to the minimum and maximum values of the variable within KHS. The blue fish icon represents the relative position of P. antennata for each variable (Maximum length=20; K=0.21; Life span=13.4; Age at first maturity=3.7; Trophic level=3.6).
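The 80% figure for P. antennata can be illustrated with a small Python sketch. It assumes the compatibility percentage is the weighted share of ecological variables that fall inside the niche interval, and that all five variables carry the same weight in the default model; both points are assumptions about PaNic's internals, and the function name is hypothetical.

```python
def compatibility_score(inside_niche, weights):
    """Weighted percentage of variables that fall inside the niche interval."""
    total = sum(weights.values())
    inside = sum(w for name, w in weights.items() if inside_niche[name])
    return 100 * inside / total


# Equal weights are an assumption; the tutorial does not state the defaults.
weights = {
    "Maximum length": 1,
    "K": 1,
    "Life span": 1,
    "Age at first maturity": 1,
    "Trophic level": 1,
}

# P. antennata: every variable except maximum length is inside the interval.
inside_niche = {name: name != "Maximum length" for name in weights}

score = compatibility_score(inside_niche, weights)
print(f"{score:.0f}%")  # 80%
```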
3 - Refining the model
To refine the model, return to the main page by hitting the back arrow or by clicking the "New" button at the bottom of the PaNic output page. Depending on your browser, if you choose the first option, your most recent entries will probably be restored (in this case, you will find M. sebastis and the Scorpaenidae already selected).
Once back on the PaNic model setup page, modify the list of selected ecological parameters. Modifications should be made on a per-model basis, taking into account all the available information for the parasite species under consideration. The relative variation of each ecological variable is helpful for evaluating the relevance of a variable and should be used as a guideline (variables with broad ranges are less informative than variables with narrow ranges). We will assume here that there are no available data on the ecology of M. sebastis, and we will therefore rely on the rVEV values (screenshot).
The previous model suggests that maximum length, life span and age at first maturity are relatively inconsistent in the KHS. You could exclude these three variables from model computation, or reduce their weight. Considering the marked differences in variability for the three parameters between the KHS and the PHS (and the consequently high rVEV values), it makes sense to exclude them by unchecking the corresponding boxes. K was less consistent than trophic level. We can account for this by lowering its weight relative to that of the other parameter. As a general rule, we suggest excluding variables with rVEV >= 1 and assigning to those with rVEV < 1 a weight inversely proportional to the position of their rVEV in the interval 0-1. Thus, set the K weight to 8 and the trophic level weight to 9 (screenshot).
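One possible reading of this rule of thumb is sketched below in Python. The linear mapping onto a 0-10 weight scale is an assumption inferred from the suggested weights of 8 and 9, and the rVEV values in the example are hypothetical: the tutorial does not list them, so read the real ones from your own model output.

```python
def suggested_weight(rvev, max_weight=10):
    """Return None to exclude a variable, else a weight that falls as rVEV rises.

    Assumption: weight = max_weight * (1 - rVEV), rounded to an integer.
    """
    if rvev >= 1:
        return None  # variable is at least as variable in the KHS as in the PHS
    return round(max_weight * (1 - rvev))


# Hypothetical rVEV values, for illustration only.
example_rvev = {"K": 0.2, "Trophic level": 0.1, "Life span": 1.3}

for name, rvev in example_rvev.items():
    weight = suggested_weight(rvev)
    print(name, "excluded" if weight is None else weight)
```

Under this reading, an rVEV of 0.2 gives a weight of 8 and an rVEV of 0.1 gives a weight of 9, while any variable with rVEV of 1 or more is dropped.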
Scroll down the family list and check Scorpaenidae again.
The section "Apply one or more filter to the projection host" allows one to limit the output of compatible hosts to those most phylogenetically or biogeographically related to those of the KHS (screenshot). Again, we can use the output of the previous model (and particularly PP and BR) as a guideline to properly set up PEV and BEV.
The value of PP is quite high for the M. sebastis KHS, which suggests taking phylogeny into account by modifying the PEV value. The effect of PEV in excluding species of the PHS from the list of probable hosts depends on the composition of the KHS and the PHS, and can vary significantly from one model to another. After running many examples to test PaNic, we often observed a threshold effect of PEV. Such a scenario is likely to be observed for a phylogenetically very homogeneous PHS (since all its members would have a similar PPI). This is the case for our PHS.
A member of the PHS will be considered likely compatible with the selected parasite only if its Phylogenetic Proximity Index (PPI) is larger than PEV*0.01. The PPI for a member of the PHS is computed as the number of hosts in the KHS of the same species, genus or family, divided by 3*N (where N is the total number of hosts in the KHS). As an example, we can consider the first member of the PHS, namely Brachypterois serrulata. Its PPI is 0.05 since:
- there are 0 hosts of the same species in the KHS;
- there are 0 hosts of the same genus in the KHS;
- there are 6 hosts of the same family in the KHS;
With N equal to 40, PPI = (0+0+6)/(3*40) = 0.05.
Brachypterois serrulata would therefore be excluded from the list of compatible hosts as soon as PEV*0.01 reaches 0.05, i.e. if PEV >= 5. Because our projection set is already constrained to hosts within one family, the minimum PPI is 0.05, which makes the family constraint redundant with controlling for phylogeny through PEV. Therefore, an increase in PEV would not change the number of compatible hosts until you set it to 5 (at this value, only hosts of the genus Scorpaena are considered compatible). Thus, unless there is an assumption about the host specificity of M. sebastis below the family level, it is not necessary to change PEV from the default value (0).
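The PPI calculation and the PEV filter can be sketched in Python as follows. This is a minimal illustration of the rule as stated in this tutorial, not PaNic's implementation; the function names are hypothetical, and the counts are those given for Brachypterois serrulata.

```python
def phylogenetic_proximity_index(same_species, same_genus, same_family, n_hosts):
    """PPI = (hosts of same species + same genus + same family) / (3 * N)."""
    return (same_species + same_genus + same_family) / (3 * n_hosts)


def passes_pev(ppi, pev):
    """A PHS member stays in the list only if PPI is larger than PEV * 0.01."""
    return ppi > pev * 0.01


# Brachypterois serrulata: 0 same species, 0 same genus, 6 same family, N = 40.
ppi = phylogenetic_proximity_index(0, 0, 6, n_hosts=40)
print(f"PPI = {ppi:.2f}")  # 0.05

for pev in (0, 2, 4, 10):
    print(pev, passes_pev(ppi, pev))
```

Because every member of a single-family PHS shares at least the family-level count, low PEV values leave the list unchanged, which is the threshold effect described above.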
BR for the KHS was 0.39, which means that there is only a slight overlap among the geographic distributions of the known hosts. Modifying the value of BEV is conceptually different from modifying that of PEV, because compatible hosts for a parasite are not necessarily encountered by the parasite. While PEV is useful for including in the model aspects of host specificity that are fundamental to defining a compatibility filter, BEV allows the user to set a biogeographic encounter filter. Here, leave BEV at 0 to obtain the complete list of compatible hosts with no encounter filter.
The last step is to limit the PHS to the species belonging to specific habitats (freshwater, brackish, saltwater). This is also an encounter filter, but it is redundant here, considering that all the members of the family Scorpaenidae are marine. The habitat filter is very useful for a large PHS that includes species from different habitats (such as one obtained by selecting all the fish species from different localities). For this example, leaving all the boxes checked or unchecking the "Freshwater" and "Brackish" ones leads to the same result (screenshot).
Now click the submit button and wait for a few seconds for the model to complete.
The model, produced after excluding the uninformative ecological variables, is more restrictive than the initial model, reducing the number of likely compatible species from 30 to 7 (screenshot).
4 - Tuning model parameters
The list of compatible hosts produced by the filtered model is considerably shorter than that of the unfiltered model, but it can be reduced further. To restrict the model, go back to the model setup page (e.g., by using the browser back button) and try modifying the model parameters (screenshot).
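Raising the Niche Restriction value (e.g., to 5) makes the model stricter by broadening the definition of an outlier: with NR = 5, a host is flagged when its average Mahalanobis distance exceeds 1.5 times (rather than 2 times) the KHS average.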
The Cutoff value determines the width of the niche boundaries. The default value (7) approximates the standard value for the Bioclim algorithm. Setting it to 5 narrows the niche intervals and shortens the list of compatible hosts.
The Threshold value sets the minimum percentage of weighted ecological variables required to be within the niche boundaries for a host to be considered compatible. Raising this to 85 increases the number of outliers identified by PaNic from one to five.
As a consequence of the raised threshold and the lowered cutoff, the number of compatible hosts declines to 3 (Pterois lunulata, Scorpaena grandicornis and Scorpaena normani) (screenshot). Note that there is no general rule for tuning PaNic model parameters. The strength of PaNic lies in its flexibility: users decide which filters to use and how restrictive they should be in order to obtain a list of a particular length and composition.