NEWS | R Documentation |
New default for the argument 'show' in functions hdda and hddc. Now 'show' is set to the value of getHDclassif.show(), which is FALSE at package loading. To set the value of 'show' permanently, use the new function setHDclassif.show.
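As a minimal usage sketch of the entry above, assuming the HDclassif package is installed and that setHDclassif.show takes a single logical value (the guard and the variable names old and shown are only illustrative):

```r
# Sketch only: runs the HDclassif calls when the package is available.
if (requireNamespace("HDclassif", quietly = TRUE)) {
  library(HDclassif)

  old <- getHDclassif.show()   # FALSE right after the package is loaded
  setHDclassif.show(TRUE)      # change the default used by hdda() and hddc()
  shown <- getHDclassif.show() # value that 'show' will now default to

  # A single call can still override the default, e.g. hdda(..., show = FALSE).

  setHDclassif.show(old)       # restore the previous setting
}
```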
More explicit warnings when the maximum number of iterations is reached in HDDC.
Convergence criterion in HDDC modified to accommodate large numbers.
Algorithm now stops when there is log-likelihood alternation.
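The stopping rules mentioned above can be illustrated with a small base R sketch. It is not the package's code: the helper name stop_status, the tolerance tol and the exact alternation test (the log-likelihood returning to its value of two iterations earlier) are assumptions made for illustration.

```r
# Hypothetical helper, not part of HDclassif.
# loglik: vector of log-likelihoods, one per iteration so far.
# eps:    threshold on the difference between two successive log-likelihoods.
# tol:    relative tolerance used to decide that two values are "the same".
stop_status <- function(loglik, eps = 1e-3, tol = 1e-10) {
  n <- length(loglik)
  if (n < 2) return("continue")

  # Usual convergence: the last step barely changed the log-likelihood.
  if (abs(loglik[n] - loglik[n - 1]) < eps) return("converged")

  # Alternation: the value jumps back to where it was two iterations ago.
  if (n >= 3) {
    scale <- max(1, abs(loglik[n]))
    if (abs(loglik[n] - loglik[n - 2]) <= tol * scale) return("alternation")
  }
  "continue"
}

stop_status(c(-100, -90, -89.9999))  # small last step
stop_status(c(-100, -90, -100))      # oscillation between two values
stop_status(c(-100, -90, -80))       # still improving
```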
[hdclassif_dim_choice (thanks to Zhenfeng He)] Bug in finding the intrinsic dimension with the BIC method could occur when **all** eigenvalues are smaller than 1e-8 (value of noise.ctrl). Now corrected.
[plot.hdc] The plot method now works properly when the selection has been made with the BIC criterion.
[plot.hdc] Now graphical parameters are reset at the end of the plot even in case of an interruption.
[SlopeHeuristic (thanks to Thibaut Marin-Cudraz)] There was an important issue in the computation of the slope heuristic, undermining its validity in some situations. This is now corrected.
[hddc (thanks to Vibhu Agarwal)] There was a problem handling data sets with fewer than 2 variables. Now corrected.
[hddc (thanks to Vibhu Agarwal)] There was a problem when performing the Cattell scree test with 2 or fewer eigenvalues. Now corrected.
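To show why short eigenvalue vectors need care, here is a hedged base R sketch of a Cattell-style scree test. It is an illustration, not the code used by hddc: the function name cattell_dim, the default threshold and the choice of returning 1 in the degenerate cases are assumptions.

```r
# Hypothetical helper, not part of HDclassif.
# ev:        eigenvalues of a class covariance matrix.
# threshold: minimal relative gap between successive eigenvalues.
cattell_dim <- function(ev, threshold = 0.2) {
  ev <- sort(ev, decreasing = TRUE)

  # With a single eigenvalue there is no gap to test: keep one dimension.
  if (length(ev) < 2) return(1L)

  gaps <- abs(diff(ev))
  # All eigenvalues equal: no informative gap, keep one dimension.
  if (max(gaps) == 0) return(1L)

  # Gaps are scaled by the largest one; the selected dimension is the
  # position of the last gap that is still above the threshold.
  rel <- gaps / max(gaps)
  max(which(rel >= threshold))
}

cattell_dim(c(10, 5, 1, 0.9, 0.8))  # gaps 5, 4, 0.1, 0.1
cattell_dim(c(3, 1))                # only one gap available
cattell_dim(4)                      # single eigenvalue
```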
[slopeHeuristic] A new option, plot, displays the slope heuristic: both the fit of the likelihoods and the value of the slope heuristic criterion.
[predict] The likelihood has been added as an output.
[hddc] New argument “subset”: it allows HDDC to be performed on a subset of the data before computing the posterior on the full sample. This can be useful for getting quick results on large datasets.
[hddc] Added a warning when the maximum number of iterations is reached.
Some cleanup and rewriting.
[hddc] there were problems when the dimension of the data was greater than d_max.
[hddc] now the hddc initialization init="vector" works properly.
[hdmda] now function hdmda works for any model.
[hdda] problem when model="ALL" in hdda is fixed
[hdda] Some rewriting regarding the option d_select.
[demo] now demo(hddc) works properly.
[hddc] now the value of the "noise" b cannot be lower than the parameter noise.ctrl.
[slopeHeuristic] Added: the slope heuristic to choose among different models in hddc
[hddc] Added ICL criterion as an output for hddc
[hddc] Added native parallel computing for hddc => new argument mc.cores
[hddc] Controls for the kmeans initialization are more easily handled => there is now a kmeans.control argument in hddc
[hddc] In hddc: The threshold of the Cattell scree test (argument threshold) can now be a vector.
[hddc] Added in hddc: argument nbrep, the number of repetitions for each combination of model/K/threshold; only the repetition with the best BIC is kept for each combination
[hddc] Added: an explicit argument "init.vector" for user made initializations
significant improvement in the algorithm for large datasets => new argument d_max that controls the maximal number of intrinsic dimensions that are computed for Cattell's scree test
[hdmda] the hdmda method for classification is added (supervised classification by using HDDC within the classes)
slight improvement of error handling
slight changes in the help files
[hddc] The ARI criterion is introduced in the function predict for hddc objects. It is a criterion to assess the goodness of fit of the clustering. ARI completely replaces the former "best classification rate", as the algorithm used to compute it was flawed.
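For readers unfamiliar with the ARI (adjusted Rand index), the following base R sketch computes it from two label vectors using the standard contingency-table formula. It is an illustration only and not the implementation used by predict; the function name adjusted_rand_index is hypothetical.

```r
# Hypothetical helper, not part of HDclassif.
# x, y: two label vectors of the same length (e.g. true classes and clusters).
adjusted_rand_index <- function(x, y) {
  tab <- table(x, y)
  comb2 <- function(n) n * (n - 1) / 2   # number of pairs among n items

  sum_ij <- sum(comb2(tab))              # pairs grouped together in both partitions
  sum_a  <- sum(comb2(rowSums(tab)))     # pairs grouped together in x
  sum_b  <- sum(comb2(colSums(tab)))     # pairs grouped together in y

  expected  <- sum_a * sum_b / comb2(sum(tab))  # value expected by chance
  max_index <- (sum_a + sum_b) / 2

  # Note: undefined (0/0) when both partitions put everything in one group.
  (sum_ij - expected) / (max_index - expected)
}

# Same partition with different label names: perfect agreement.
adjusted_rand_index(c(1, 1, 2, 2, 3, 3), c("b", "b", "a", "a", "c", "c"))
# Partitions that cross each other completely: below chance level.
adjusted_rand_index(c(1, 1, 2, 2), c(1, 2, 1, 2))
```

Because the index only depends on which observations are grouped together, it does not require matching cluster labels to class labels, which is the step a "best classification rate" has to solve.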
The readability of help files is improved.
[hddc] now hddc will stop if the number of potential individuals in a class is lower than 2; if so it will give the message 'empty class', which means strictly less than 2 individuals.
[hddc] changing the name of the argument 'ctrl' in hddc to 'min.individuals' for a clearer meaning. Now the argument 'min.individuals' is the minimum NUMBER of individuals in a class, it is not a PERCENTAGE of the total number of observations anymore. Its value cannot be lower than 2.
[hddc & hdda] in hddc and hdda the option 'dim.ctrl' is now named 'noise.ctrl' for a clearer meaning
[hddc] some errors are now handled properly by the function hddc
[hddc] corrected a bug when using common dimension models which could lead to a selection of the intrinsic dimension of 0
[hddc] important fix in hddc.Rd: there was a mismatch between the code and the help files; the stopping criterion (eps) is now correctly described as 'the difference between two successive log likelihoods' (as it is in the code) instead of 'the difference between two successive log likelihood per observation'
some rewriting of help files
the citation of the package is updated
new reference in .Rd files
now the BIC and the log likelihood are not divided by N (the number of observations) anymore
very slight changes in the random initialization of hddc (now the random init cannot begin with an empty class)
[hdda] added some features to hdda, notably the model "all" and the V-fold cross validation for dimension selection
[hdda] a cross-validation option has been added for hdda in order to select the best dimension or threshold with respect to the CV result
[hdda] added a leave-one-out cross-validation option to hdda
[plot.hdc] big changes in the function plot.hdc. Now the dimension selection using either Cattell's scree-test or the BIC can be plotted.
[plot.hdc] The graph of the eigenvalues has been removed.
[plot.hdc] Graph scale changed for Cattell's scree-test to see directly the threshold levels
[hddc] now it is possible to select the dimension with the "bic" criterion in hddc
[hddc] added some warnings when the value of the parameter b is very low (lower than 1e-6)
the calculation trick used when N<p is now applied as soon as Ni<p, Ni being the number of observations for the class i
changes to the predict.hdc function. It now works in all cases.
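The entry does not spell out the trick. A common way to handle fewer observations than variables, assumed here for illustration, is to decompose the small n x n Gram matrix instead of the p x p covariance matrix, since both share the same non-zero eigenvalues. The base R sketch below follows that assumption; the function name leading_eigen is hypothetical and at least 2 observations are expected.

```r
# Hypothetical helper, not part of HDclassif.
# X: numeric matrix with n rows (observations) and p columns (variables).
leading_eigen <- function(X) {
  n <- nrow(X)
  p <- ncol(X)
  Xc <- sweep(X, 2, colMeans(X))  # center each variable

  if (n < p) {
    # Small problem: n x n Gram matrix. Centered data has rank at most n - 1.
    eg <- eigen(tcrossprod(Xc) / n, symmetric = TRUE)
    keep <- seq_len(n - 1)
    values <- eg$values[keep]
    # Map Gram eigenvectors back to the variable space and normalize them.
    vectors <- crossprod(Xc, eg$vectors[, keep, drop = FALSE])
    vectors <- sweep(vectors, 2, sqrt(colSums(vectors^2)), "/")
  } else {
    eg <- eigen(crossprod(Xc) / n, symmetric = TRUE)
    values <- eg$values
    vectors <- eg$vectors
  }
  list(values = values, vectors = vectors)
}

set.seed(1)
X <- matrix(rnorm(5 * 50), nrow = 5, ncol = 50)
res <- leading_eigen(X)

# Check against the direct (and more expensive) p x p decomposition.
Xc <- sweep(X, 2, colMeans(X))
S <- crossprod(Xc) / nrow(X)
direct <- eigen(S, symmetric = TRUE, only.values = TRUE)$values[1:4]
all.equal(res$values, direct)
```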
Big rewriting.
[hddc] hddc can now be initialized with a given class vector
slight change in the demo functions
the description of the package is changed
the models can now be selected using integers instead of names
the graph of hddc now gives the comparison between different models and different numbers of clusters
the calculation of the log likelihood has been modified in hddc
When several models are given, HDDA and HDDC now explicitly give the model they select
[hddc] The kmeans initialization can be configured by the user through the dots argument: ...
[hddc] hddc now handles several models at once
A demo has been built for the methods hdda and hddc
[plot] A minor plot issue is fixed
Some names are changed in the functions hdda and hddc:
Former name -> New name
AkiBkQkDk -> AkjBkQkDk
AkiBQkDk -> AkjBQkDk
AkiBkQkD -> AkjBkQkD
AkiBQkD -> AkjBQkD
AiBQD -> AjBQD