Calculating biotite formula from electron microprobe analysis data using a machine learning method based on principal components regression

Research output: Contribution to journal › Article › Research › peer review

Authors

  • Xiaoyan Li
  • Chao Zhang
  • Harald Behrens
  • Francois Holtz

External Research Organisations

  • Northwest University China

Details

Original language: English
Article number: 105371
Journal: Lithos
Volume: 356-357
Early online date: 10 Jan 2020
Publication status: Published - Mar 2020

Abstract

We present a new machine learning method for calculating the structural formula of biotite (sensu lato) from electron microprobe analysis (EMPA) data. The method is based on principal components regression (PCR) of a dataset of 155 fully analyzed reference biotites for which both chemical analyses and crystal structure refinements are available. The dataset is randomly split into a training set (75% of the samples) and a test set (25%). The training set is used to calibrate the regression model and the test set is used to evaluate its performance. The resulting linear regression coefficient matrix is then applied to calculate the mole proportions of cations and anions in biotite samples from their EMPA compositional data. With this method, the distribution of cations and anions among the different sites can be calculated, including tetrahedral Fe3+, octahedral Fe2+, octahedral Fe3+, and OH and WO2− at the O(4) site. The O(4) site is assumed to be occupied by anions according to the relation 2 = F + Cl + OH + WO2−. Octahedral and interlayer vacancies can also be estimated with this model. The prediction quality for major elements is high, with R² > 0.95. The absolute errors in the estimated octahedral Fe2+, octahedral Al and OH at the O(4) site are ±0.2 apfu (atoms per formula unit, based on 11 O + 2 (F, Cl, OH, O)), while those in total Fe3+ and WO2− at the O(4) site are approximately ±0.3 apfu. A funnel-shaped relationship between the absolute error in the Fe3+/ΣFe ratio and FeOT wt% is observed, with the majority of errors falling within ±20%. Compared to previous normalization schemes, our model shows significant improvements in estimating Fe3+/ΣFe and WO2− at the O(4) site. Our model is suitable for calculating mineral formulae of common igneous and hydrothermal biotites, but not for those that have been modified by post-formation oxidation or reduction.
A supplementary Excel spreadsheet is provided for performing the calculation from EMPA data.
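
The workflow summarized in the abstract (a random 75%/25% split, principal components regression on the training set, then applying the resulting coefficient matrix to new analyses) can be illustrated with a minimal Python sketch. This is not the authors' implementation or their spreadsheet: the data are synthetic random numbers rather than biotite analyses, the function names `fit_pcr` and `predict_pcr` are hypothetical, the choice of six predictors and three targets is arbitrary, and the use of mean-centring only (no scaling) is an assumption. Only the sample count (155) and the split ratio are taken from the abstract.

```python
import numpy as np


def fit_pcr(X, Y, n_components):
    """Fit a principal components regression.

    Returns (x_mean, y_mean, coef), where coef maps centred predictors
    to centred targets. Hypothetical helper, not the authors' code.
    """
    x_mean = X.mean(axis=0)
    y_mean = Y.mean(axis=0)
    Xc = X - x_mean
    Yc = Y - y_mean
    # Rows of Vt are the principal axes of the centred predictors.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T              # (n_features, k)
    scores = Xc @ V                      # (n_samples, k)
    # Ordinary least squares of the targets on the component scores.
    gamma, *_ = np.linalg.lstsq(scores, Yc, rcond=None)
    coef = V @ gamma                     # (n_features, n_targets)
    return x_mean, y_mean, coef


def predict_pcr(model, X):
    """Apply the fitted coefficient matrix to new predictor rows."""
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean


# Synthetic stand-in data (NOT real biotite compositions).
rng = np.random.default_rng(0)
n_samples, n_predictors, n_targets = 155, 6, 3
X = rng.normal(size=(n_samples, n_predictors))
true_coef = rng.normal(size=(n_predictors, n_targets))
Y = X @ true_coef + 0.5                  # noise-free linear targets

# Random 75% / 25% split into training and test sets.
order = rng.permutation(n_samples)
n_train = int(0.75 * n_samples)
train, test = order[:n_train], order[n_train:]

# Keeping all components makes PCR equivalent to ordinary least squares;
# a real application would usually retain fewer components.
model = fit_pcr(X[train], Y[train], n_components=n_predictors)
pred = predict_pcr(model, X[test])
max_abs_error = float(np.abs(pred - Y[test]).max())
print(len(train), len(test), max_abs_error)
```

Because the synthetic targets are exactly linear and all components are retained, the test-set error here is at floating-point level. With real analyses, the number of retained components and the preprocessing would need to be chosen and validated against the test set.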

Keywords

    Biotite, Mineral formula, Oxidation state, Principal component regression, Ti-oxy substitution

Cite this

Calculating biotite formula from electron microprobe analysis data using a machine learning method based on principal components regression. / Li, Xiaoyan; Zhang, Chao; Behrens, Harald et al.
In: Lithos, Vol. 356-357, 105371, 03.2020.

Li X, Zhang C, Behrens H, Holtz F. Calculating biotite formula from electron microprobe analysis data using a machine learning method based on principal components regression. Lithos. 2020 Mar;356-357:105371. Epub 2020 Jan 10. doi: 10.1016/j.lithos.2020.105371, 10.1016/j.lithos.2020.105506
BibTeX
@article{0c12c1bf5a1d4713b8b2cce61be226bd,
title = "Calculating biotite formula from electron microprobe analysis data using a machine learning method based on principal components regression",
abstract = "We present a new machine learning method for calculating the structural formula of biotite (sensu lato) from electron microprobe analysis (EMPA) data. The method is based on principal components regression (PCR) of a dataset of 155 fully analyzed reference biotites for which both chemical analyses and crystal structure refinements are available. The dataset is randomly split into a training set (75% of the samples) and a test set (25%). The training set is used to calibrate the regression model and the test set is used to evaluate its performance. The resulting linear regression coefficient matrix is then applied to calculate the mole proportions of cations and anions in biotite samples from their EMPA compositional data. With this method, the distribution of cations and anions among the different sites can be calculated, including tetrahedral Fe3+, octahedral Fe2+, octahedral Fe3+, and OH and WO2− at the O(4) site. The O(4) site is assumed to be occupied by anions according to the relation 2 = F + Cl + OH + WO2−. Octahedral and interlayer vacancies can also be estimated with this model. The prediction quality for major elements is high, with R² > 0.95. The absolute errors in the estimated octahedral Fe2+, octahedral Al and OH at the O(4) site are ±0.2 apfu (atoms per formula unit, based on 11 O + 2 (F, Cl, OH, O)), while those in total Fe3+ and WO2− at the O(4) site are approximately ±0.3 apfu. A funnel-shaped relationship between the absolute error in the Fe3+/ΣFe ratio and FeOT wt% is observed, with the majority of errors falling within ±20%. Compared to previous normalization schemes, our model shows significant improvements in estimating Fe3+/ΣFe and WO2− at the O(4) site. Our model is suitable for calculating mineral formulae of common igneous and hydrothermal biotites, but not for those that have been modified by post-formation oxidation or reduction.
A supplementary Excel spreadsheet is provided for performing the calculation from EMPA data.",
keywords = "Biotite, Mineral formula, Oxidation state, Principal component regression, Ti-oxy substitution",
author = "Xiaoyan Li and Chao Zhang and Harald Behrens and Francois Holtz",
note = "Publisher Copyright: {\textcopyright} 2020",
year = "2020",
month = mar,
doi = "10.1016/j.lithos.2020.105371",
language = "English",
volume = "356-357",
journal = "Lithos",
issn = "0024-4937",
publisher = "Elsevier",
pages = "105371",
}

RIS

TY - JOUR

T1 - Calculating biotite formula from electron microprobe analysis data using a machine learning method based on principal components regression

AU - Li, Xiaoyan

AU - Zhang, Chao

AU - Behrens, Harald

AU - Holtz, Francois

N1 - Publisher Copyright: © 2020

PY - 2020/3

Y1 - 2020/3

AB - We present a new machine learning method for calculating the structural formula of biotite (sensu lato) from electron microprobe analysis (EMPA) data. The method is based on principal components regression (PCR) of a dataset of 155 fully analyzed reference biotites for which both chemical analyses and crystal structure refinements are available. The dataset is randomly split into a training set (75% of the samples) and a test set (25%). The training set is used to calibrate the regression model and the test set is used to evaluate its performance. The resulting linear regression coefficient matrix is then applied to calculate the mole proportions of cations and anions in biotite samples from their EMPA compositional data. With this method, the distribution of cations and anions among the different sites can be calculated, including tetrahedral Fe3+, octahedral Fe2+, octahedral Fe3+, and OH and WO2− at the O(4) site. The O(4) site is assumed to be occupied by anions according to the relation 2 = F + Cl + OH + WO2−. Octahedral and interlayer vacancies can also be estimated with this model. The prediction quality for major elements is high, with R² > 0.95. The absolute errors in the estimated octahedral Fe2+, octahedral Al and OH at the O(4) site are ±0.2 apfu (atoms per formula unit, based on 11 O + 2 (F, Cl, OH, O)), while those in total Fe3+ and WO2− at the O(4) site are approximately ±0.3 apfu. A funnel-shaped relationship between the absolute error in the Fe3+/ΣFe ratio and FeOT wt% is observed, with the majority of errors falling within ±20%. Compared to previous normalization schemes, our model shows significant improvements in estimating Fe3+/ΣFe and WO2− at the O(4) site. Our model is suitable for calculating mineral formulae of common igneous and hydrothermal biotites, but not for those that have been modified by post-formation oxidation or reduction.
A supplementary Excel spreadsheet is provided for performing the calculation from EMPA data.

KW - Biotite

KW - Mineral formula

KW - Oxidation state

KW - Principal component regression

KW - Ti-oxy substitution

UR - http://www.scopus.com/inward/record.url?scp=85077930392&partnerID=8YFLogxK

U2 - 10.1016/j.lithos.2020.105371

DO - 10.1016/j.lithos.2020.105371

M3 - Article

AN - SCOPUS:85077930392

VL - 356-357

JO - Lithos

JF - Lithos

SN - 0024-4937

M1 - 105371

ER -
