Mathematik, Informatik und Statistik - Open Access LMU - Teil 03/03

Ludwig-Maximilians-Universität München
epub.ub.uni-muenchen.de
The University Library (UB) maintains an extensive archive of electronic media, ranging from full-text collections, newspaper archives, dictionaries, and encyclopedias to detailed bibliographies and more than 1,000 databases. On iTunes U, the UB provides, among other things, a selection of electronic publications by researchers at LMU. (This is part 3 of 3 of the collection 'Mathematik, Informatik und Statistik - Open Access LMU'.)

Episodes

A General Framework for the Selection of Effect Type in Ordinal Regression 1/2

In regression models for ordinal response, each covariate can be equipped with either a simple, global effect or a more flexible and complex effect which is specific to the response categories. Instead of a priori assuming one of these effect types, as is done in the majority of the literature, we argue in this paper that effect type selection shall be data-based. For this purpose, we propose a novel and general penalty framework that allows for an automatic, data-driven selection between global...

Jan 18, 2016

A General Framework for the Selection of Effect Type in Ordinal Regression 2/2

In regression models for ordinal response, each covariate can be equipped with either a simple, global effect or a more flexible and complex effect which is specific to the response categories. Instead of a priori assuming one of these effect types, as is done in the majority of the literature, we argue in this paper that effect type selection shall be data-based. For this purpose, we propose a novel and general penalty framework that allows for an automatic, data-driven selection between global...

Jan 18, 2016

Identifiability in penalized function-on-function regression models

Regression models with functional responses and covariates constitute a powerful and increasingly important model class. However, regression with functional data poses well known and challenging problems of non-identifiability. This non-identifiability can manifest itself in arbitrarily large errors for coefficient surface estimates despite accurate predictions of the responses, thus invalidating substantial interpretations of the fitted models. We offer an accessible rephrasing of these identif...

Jan 01, 2016

Identifiability in penalized function-on-function regression models

Regression models with functional covariates for functional responses constitute a powerful and increasingly important model class. However, regression with functional data poses challenging problems of non-identifiability. We describe these identifiability issues in realistic applications of penalized linear function-on-function-regression and delimit the set of circumstances under which they arise. Specifically, functional covariates whose empirical covariance has lower effective rank than the...

Jun 11, 2015

What can the Real World do for simulation studies? A comparison of exploratory methods

For simulation studies on the exploratory factor analysis (EFA), usually rather simple population models are used without model errors. In the present study, real data characteristics are used for Monte Carlo simulation studies. Real large data sets are examined and the results of EFA on them are taken as the population models. First we apply a resampling technique on these data sets with sub samples of different sizes. Then, a Monte Carlo study is conducted based on the parameters of the popula...

Apr 14, 2015

Estimating individual treatment effects from responses and a predictive biomarker in a parallel group RCT

When being interested in administering the best of two treatments to an individual patient i, it is necessary to know the individual treatment effects (ITEs) of the considered subjects and the correlation between the possible responses (PRs) for two treatments. When data are generated in a parallel–group design RCT, it is not possible to determine the ITE for a single subject since we only observe two samples from the marginal distributions of these PRs and not the corresponding joint distributi...

Dec 24, 2014

Minimization and estimation of the variance of prediction errors for cross-validation designs 2/2

We consider the mean prediction error of a classification or regression procedure as well as its cross-validation estimates, and investigate the variance of this estimate as a function of an arbitrary cross-validation design. We decompose this variance into a scalar product of coefficients and certain covariance expressions, such that the coefficients depend solely on the resampling design, and the covariances depend solely on the data's probability distribution. We rewrite this scalar product i...

Nov 01, 2014

Minimization and estimation of the variance of prediction errors for cross-validation designs 1/2

We consider the mean prediction error of a classification or regression procedure as well as its cross-validation estimates, and investigate the variance of this estimate as a function of an arbitrary cross-validation design. We decompose this variance into a scalar product of coefficients and certain covariance expressions, such that the coefficients depend solely on the resampling design, and the covariances depend solely on the data's probability distribution. We rewrite this scalar product i...

Nov 01, 2014

Possibilities and Limitations of Spatially Explicit Site Index Modelling for Spruce Based on National Forest Inventory Data and Digital Maps of Soil and Climate in Bavaria (SE Germany)

Combining national forest inventory (NFI) data with digital site maps of high resolution enables spatially explicit predictions of site productivity. The aim of this study is to explore the possibilities and limitations of this database to analyze the environmental dependency of height-growth of Norway spruce and to predict site index (SI) on a scale that is relevant for local forest management. The study region is the German federal state of Bavaria. The exploratory methods comprise significanc...

Nov 01, 2014

A variance decomposition and a Central Limit Theorem for empirical losses associated with resampling designs

The mean prediction error of a classification or regression procedure can be estimated using resampling designs such as the cross-validation design. We decompose the variance of such an estimator associated with an arbitrary resampling procedure into a small linear combination of covariances between elementary estimators, each of which is a regular parameter as described in the theory of $U$-statistics. The enumerative combinatorics of the occurrence frequencies of these covariances govern the l...

Nov 01, 2014

Modeling Clustered Heterogeneity: Fixed Effects, Random Effects and Mixtures

Although each statistical unit on which measurements are taken is unique, typically there is not enough information available to account totally for its uniqueness. Therefore heterogeneity among units has to be limited by structural assumptions. One classical approach is to use random effects models which assume that heterogeneity can be described by distributional assumptions. However, inference may depend on the assumed mixing distribution and it is assumed that the random effects and the obse...

Oct 30, 2014

Improved Methods for the Imputation of Missing Data by Nearest Neighbor Methods

Missing data is an important issue in almost all fields of quantitative research. A nonparametric procedure that has been shown to be useful is the nearest neighbor imputation method. We suggest a weighted nearest neighbor imputation method based on Lq-distances. The weighted method is shown to have smaller imputation error than available NN estimates. In addition we consider weighted neighbor imputation methods that use selected distances. The careful selection of distances that carry informati...

Oct 13, 2014

Tree-Structured Modelling of Categorical Predictors in Regression

Generalized linear and additive models are very efficient regression tools but the selection of relevant terms becomes difficult if higher order interactions are needed. In contrast, tree-based methods also known as recursive partitioning are explicitly designed to model a specific form of interaction but with their focus on interaction tend to neglect the main effects. The method proposed here focusses on the main effects of categorical predictors by using tree type methods to obtain clusters. ...

Aug 12, 2014

Categorical variables with many categories are preferentially selected in model selection procedures for multivariable regression models on bootstrap samples

To perform model selection in the context of multivariable regression, automated variable selection procedures such as backward elimination are commonly employed. However, these procedures are known to be highly unstable. Their stability can be investigated using bootstrap-based procedures: the idea is to perform model selection on a high number of bootstrap samples successively and to examine the obtained models, for instance in terms of the inclusion of specific predictor variables. However, f...

Aug 07, 2014

The linear GMM model with singular covariance matrix due to the elimination of a nuisance parameter

When in a linear GMM model nuisance parameters are eliminated by multiplying the moment conditions by a projection matrix, the covariance matrix of the model, the inverse of which is typically used to construct an efficient GMM estimator, turns out to be singular and thus cannot be inverted. However, one can show that the generalized inverse can be used instead to produce an efficient estimator. Various other matrices in place of the projection matrix do the same job, i.e., they eliminate the nu...

Jun 30, 2014

Variable Selection for Discrete Competing Risks Models

In competing risks models one distinguishes between several distinct target events that end duration. Since the effects of covariates are specific to the target events, the model contains a large number of parameters even when the number of predictors is not very large. Therefore, reduction of the complexity of the model, in particular by deletion of all irrelevant predictors, is of major importance. A selection procedure is proposed that aims at selection of variables rather than parameters. It...

May 28, 2014