Robustness signifies insensitivity to small deviations
from assumptions. – Peter J. Huber
Oakleaf implements the robust statistical methods
in the form of a computer library.
It is developed for Munipack,
yet designed for a general use.
The estimators of the robust mean based on the order statistic
and M-estimates are included.
The regression and photon rates are being prepared.
Oakleaf is an open source project available under
LGPLv3.
It is developed in modern Fortran with regard to both reliability
and efficiency.
Interfaces to other programming and scripting languages are planned.
Oakleaf is used in the same fashion as other libraries:
Use
Include use oakleaf in the preambule
of the program. It provide declarations of routines.
Call
Call a desired subroutine.
Link
Link the program with the Oakleaf library. The
library depends on common system libraries and Minpack.
An illustration can be the program saved into hello.f08:
program hello
use oakleaf
real :: mean
call rmean([1.0,2.0,3.0],mean)
write(*,*) "Hello world, the mean is:",mean
end program hello
The program can be compiled, linked and run as:
$ gfortran -I/usr/include hello.f08 -L/usr/lib -loakleaf -lminpack
$ ./a.out
Hello world, the mean is: 2.00038409
The setup for paths, both the includes and libraries,
as -I/path/to/fortran/modules
or -L/path/to/libraries,
depends on the current instalation.
The sub-tree is /usr/local
if installed by hand,
whilst the packages for GNU/Debian or Ubuntu place it into
/usr, like the example.
The subroutine is generic, supporting
REAL32,
REAL64 and
REAL128 (available on 64-bit platforms) datatypes.
The reccomended, if the nature of a given problem allows it, is REAL64.
REAL32 has low accuracy, whilst REAL128 is slow.
Description
The robust mean subroutine rmean() estimates
the mean, and related statistical quantities.
The data are supposed to be sample from the
Normal distribution with possible contamination of a unknown origin.
The parameters for both the location and the dispersion of
the Normal distribution N(mean,stdsig) are estimated.
The true value X of the sample falls, with 68% probability,
into the interval mean - stderr < X < mean + stderr.
Parameters
On input:
data
array of data,
mean
(optional) an initial estimate of the mean,
scale
(optional) an initial estimate of the scale (i.e. stdsig),
reltol
(optional) the relative accuracy of determination
of the optimum (default: 2.4%),
robfun
(optional) select the robust function among:
`hampel' (default), `tukey', `huber'
or `square' (non-robust least square),
data are the mandatory input parameter.
The procedure initialisation is done internally
if rankinit == .true. (default);
otherwise, both mean,scale must be set
by the calling procedure.
The initial estimates are taken by the median and MAD
described in qmean.
The mandatory output parameter is the robust mean.
The check of the status is recommended.
Example
Save the program to the file example.f08:
program example
use oakleaf
use iso_fortran_env
real(REAL64), dimension(5) :: data = [ 1, 2, 3, 4, 5 ]
real(REAL64) :: mean,stderr,stdsig
call rmean(data,mean,stderr,stdsig)
write(*,*) 'rmean:',mean,' stderr:',stderr,'stdsig:',stdsig
end program example
Than compile and run it (the paths of -I, -L
are an example):
The report file keeps the detailed track of the algorithm
for visualisation or debugging purposes. It is not recommended
on a common use as it slow-down a run.
The format is (see src/sqpfmean.f08) self-descibing
and includes: initial estimates, the optimisation steps
(current values, trust-region radius, the step, the gradient, the quasi-Newton
hessian and the trust-step flag) and final number of iterations,
function evaluations, restores, steps out of the trust-region,
indefinite and hard steps and the final hessian computed analytically.
References
Hroch,F.: A contribution to the estimate of the robust mean value, in preparation
use oakleaf
subroutine qmean(data,mean,stderr,stdsig,flag)
real, dimension(:), intent(in) :: data
real, intent(out) :: mean
real, intent(out), optional :: stderr,stdsig
logical, intent(in), optional :: verbose
integer, intent(out), optional :: flag
end subroutine qmean
Description
This routine estimates both averadge and standard deviation
of a data set on base of the
order statistics,
i.e. quantiles.
The median, the Q(0.5) quantile, is claimed as the (robust) mean.
The standard deviation (and consequently the standard error)
is estimated preferably
by the Qn-estimator due Rousseeuw & Croux;
one is more durable on
highly contamined samples, yet additional memory space is required.
In other cases (N < 3 or N > 2**16),
the deviations is determined as MAD/0.6745,
where MAD is the median of absolute deviations.
Both the median and the Qn,MAD are robust estimators,
however they
are generally incompatible to least-square estimates, as being
granulated and deviated for contamined data.
The quick sort algorithm
is used for small samples N ≤ 42;
one requires N log N operations.
The larger samples are estimated by the quick median which needs
2N operations
(Wirth,D.: Algorithm and data structures, chapter Finding the Median).
The Qn-estimator requires N**2/2 elements.
Parameters
On input:
data
array of data
On output:
mean
mean by median
stderr
(optional) standard error
stdsig
(optional) standard deviation by MAD
verbose
(optional) print a detailed information
about the calulations.
program example
use oakleaf
real, dimension(5) :: data = [ 1, 2, 3, 4, 5 ]
real :: mean,stderr,stdsig
call qmean(data,mean,stderr,stdsig)
write(*,*) 'qmean:',mean,' stderr:',stderr,'stdsig:',stdsig
end program example
Wirth,D.: Algorithm and data structures,
Prentice-Hall International editions (1986)
Rousseeuw, Peter J and Croux, Christophe:
Alternatives to the median absolute deviation,
Journal of the American Statistical association vol. 88, 1273--1283 (1993)
$ wget https://integral.physics.muni.cz/pub/oakleaf/oakleaf-1.0.1.tar.gz
$ tar zxf oakleaf-1.0.1.tar.gz
The package build
The Oakleaf library should to be build in the following steps:
$ cd oakleaf-1.0.1/
$ ./configure FCFLAGS="-O2 -ffpe-trap=invalid,zero,overflow"
$ make
# make install # as root, or sudo
It can be later un-installed, if the build directory
still exists, by command
$ cd [some directory]/oakleaf-1.0.1/
# make uninstall # as root
The building can be customised via optional parameters of the configure.
The optimisation "-O2" is reccomended.
The library is installed
under /usr/local by default
(see configure --help).
If Minpack is not found, although installed, by ./configure,
try to add LDFLAGS="-L/usr/local/lib"
(eventually adjust the path) and check presence of the file
libminpack.a.
Smoke tests
The Oakleaf package has included a verification.
The basic tests can be executed as:
$ make check
The advanced testing, under a high voltage, is performed by
$ VOLTAGE=high make check
The placement
By default, the library is placed under /usr/local
according to the switch --prefix= of ./configure.
A program is compiled as
If the library is installed by packaging system,
--prefix=/usr is selected.
For GNU/Debian, as well as derivatives like Ubuntu or Mint, the compile step
requires only -I/usr/include:
gfortran -I/usr/include ..
The library can be found in /usr/lib/<arch>,
where the arch x86_64-linux-gnu is just for example.
Initial release as a fork of robust methods maintained for Munipack
(18. Jun. 2018)
Oakleaf 0.0.1
The first public release (22. Jan. 2019)
Added documentation.
Added automachinery.
Oakleaf 1.0.0
The implemenatation of the robust mean (31. Jan. 2025)
The robust mean routine has been completely reimplemented.
It is based on the developed method for optimisation
applied on the likelihood in two-dimensional space.
Oakleaf 1.0.1
The flexible implementation of the robust mean (13. Jan. 2025)
The release can be compiled under 32-bit platforms with
no support of the quadruple precision.
Robustness signifies insensitivity to small deviations
from assumptions. – Peter J. Huber
Oakleaf implements the robust statistical methods
in the form of a computer library.
It is developed for Munipack,
yet designed for a general use.
The estimators of the robust mean based on the order statistic
and M-estimates are included.
The regression and photon rates are being prepared.
Oakleaf is an open source project available under
LGPLv3.
It is developed in modern Fortran with regard to both reliability
and efficiency.
Interfaces to other programming and scripting languages are planned.
Oakleaf is used in the same fashion as other libraries:
Use
Include use oakleaf in the preambule
of the program. It provide declarations of routines.
Call
Call a desired subroutine.
Link
Link the program with the Oakleaf library. The
library depends on common system libraries and Minpack.
An illustration can be the program saved into hello.f08:
program hello
use oakleaf
real :: mean
call rmean([1.0,2.0,3.0],mean)
write(*,*) "Hello world, the mean is:",mean
end program hello
The program can be compiled, linked and run as:
$ gfortran -I/usr/include hello.f08 -L/usr/lib -loakleaf -lminpack
$ ./a.out
Hello world, the mean is: 2.00038409
The setup for paths, both the includes and libraries,
as -I/path/to/fortran/modules
or -L/path/to/libraries,
depends on the current instalation.
The sub-tree is /usr/local
if installed by hand,
whilst the packages for GNU/Debian or Ubuntu place it into
/usr, like the example.
The subroutine is generic, supporting
REAL32,
REAL64 and
REAL128 (available on 64-bit platforms) datatypes.
The reccomended, if the nature of a given problem allows it, is REAL64.
REAL32 has low accuracy, whilst REAL128 is slow.
Description
The robust mean subroutine rmean() estimates
the mean, and related statistical quantities.
The data are supposed to be sample from the
Normal distribution with possible contamination of a unknown origin.
The parameters for both the location and the dispersion of
the Normal distribution N(mean,stdsig) are estimated.
The true value X of the sample falls, with 68% probability,
into the interval mean - stderr < X < mean + stderr.
Parameters
On input:
data
array of data,
mean
(optional) an initial estimate of the mean,
scale
(optional) an initial estimate of the scale (i.e. stdsig),
reltol
(optional) the relative accuracy of determination
of the optimum (default: 2.4%),
robfun
(optional) select the robust function among:
`hampel' (default), `tukey', `huber'
or `square' (non-robust least square),
data are the mandatory input parameter.
The procedure initialisation is done internally
if rankinit == .true. (default);
otherwise, both mean,scale must be set
by the calling procedure.
The initial estimates are taken by the median and MAD
described in qmean.
The mandatory output parameter is the robust mean.
The check of the status is recommended.
Example
Save the program to the file example.f08:
program example
use oakleaf
use iso_fortran_env
real(REAL64), dimension(5) :: data = [ 1, 2, 3, 4, 5 ]
real(REAL64) :: mean,stderr,stdsig
call rmean(data,mean,stderr,stdsig)
write(*,*) 'rmean:',mean,' stderr:',stderr,'stdsig:',stdsig
end program example
Than compile and run it (the paths of -I, -L
are an example):
The report file keeps the detailed track of the algorithm
for visualisation or debugging purposes. It is not recommended
on a common use as it slow-down a run.
The format is (see src/sqpfmean.f08) self-descibing
and includes: initial estimates, the optimisation steps
(current values, trust-region radius, the step, the gradient, the quasi-Newton
hessian and the trust-step flag) and final number of iterations,
function evaluations, restores, steps out of the trust-region,
indefinite and hard steps and the final hessian computed analytically.
References
Hroch,F.: A contribution to the estimate of the robust mean value, in preparation
use oakleaf
subroutine qmean(data,mean,stderr,stdsig,flag)
real, dimension(:), intent(in) :: data
real, intent(out) :: mean
real, intent(out), optional :: stderr,stdsig
logical, intent(in), optional :: verbose
integer, intent(out), optional :: flag
end subroutine qmean
Description
This routine estimates both averadge and standard deviation
of a data set on base of the
order statistics,
i.e. quantiles.
The median, the Q(0.5) quantile, is claimed as the (robust) mean.
The standard deviation (and consequently the standard error)
is estimated preferably
by the Qn-estimator due Rousseeuw & Croux;
one is more durable on
highly contamined samples, yet additional memory space is required.
In other cases (N < 3 or N > 2**16),
the deviations is determined as MAD/0.6745,
where MAD is the median of absolute deviations.
Both the median and the Qn,MAD are robust estimators,
however they
are generally incompatible to least-square estimates, as being
granulated and deviated for contamined data.
The quick sort algorithm
is used for small samples N ≤ 42;
one requires N log N operations.
The larger samples are estimated by the quick median which needs
2N operations
(Wirth,D.: Algorithm and data structures, chapter Finding the Median).
The Qn-estimator requires N**2/2 elements.
Parameters
On input:
data
array of data
On output:
mean
mean by median
stderr
(optional) standard error
stdsig
(optional) standard deviation by MAD
verbose
(optional) print a detailed information
about the calulations.
program example
use oakleaf
real, dimension(5) :: data = [ 1, 2, 3, 4, 5 ]
real :: mean,stderr,stdsig
call qmean(data,mean,stderr,stdsig)
write(*,*) 'qmean:',mean,' stderr:',stderr,'stdsig:',stdsig
end program example
Wirth,D.: Algorithm and data structures,
Prentice-Hall International editions (1986)
Rousseeuw, Peter J and Croux, Christophe:
Alternatives to the median absolute deviation,
Journal of the American Statistical association vol. 88, 1273--1283 (1993)
$ wget https://integral.physics.muni.cz/pub/oakleaf/oakleaf-1.0.1.tar.gz
$ tar zxf oakleaf-1.0.1.tar.gz
The package build
The Oakleaf library should to be build in the following steps:
$ cd oakleaf-1.0.1/
$ ./configure FCFLAGS="-O2 -ffpe-trap=invalid,zero,overflow"
$ make
# make install # as root, or sudo
It can be later un-installed, if the build directory
still exists, by command
$ cd [some directory]/oakleaf-1.0.1/
# make uninstall # as root
The building can be customised via optional parameters of the configure.
The optimisation "-O2" is reccomended.
The library is installed
under /usr/local by default
(see configure --help).
If Minpack is not found, although installed, by ./configure,
try to add LDFLAGS="-L/usr/local/lib"
(eventually adjust the path) and check presence of the file
libminpack.a.
Smoke tests
The Oakleaf package has included a verification.
The basic tests can be executed as:
$ make check
The advanced testing, under a high voltage, is performed by
$ VOLTAGE=high make check
The placement
By default, the library is placed under /usr/local
according to the switch --prefix= of ./configure.
A program is compiled as
If the library is installed by packaging system,
--prefix=/usr is selected.
For GNU/Debian, as well as derivatives like Ubuntu or Mint, the compile step
requires only -I/usr/include:
gfortran -I/usr/include ..
The library can be found in /usr/lib/<arch>,
where the arch x86_64-linux-gnu is just for example.
Initial release as a fork of robust methods maintained for Munipack
(18. Jun. 2018)
Oakleaf 0.0.1
The first public release (22. Jan. 2019)
Added documentation.
Added automachinery.
Oakleaf 1.0.0
The implemenatation of the robust mean (31. Jan. 2025)
The robust mean routine has been completely reimplemented.
It is based on the developed method for optimisation
applied on the likelihood in two-dimensional space.
Oakleaf 1.0.1
The flexible implementation of the robust mean (13. Jan. 2025)
The release can be compiled under 32-bit platforms with
no support of the quadruple precision.