Marco Piccirelli, S Spichtig, R S Vorburger, Michael Wolf, Near-infrared imaging sensor with improved handling and direct localization in simultaneous magnetic resonance imaging measurements, Journal of Innovative Optical Health Sciences, Vol. 4 (2), 2011. (Journal Article)
We present a novel optical sensor to simultaneously acquire functional near-infrared imaging (fNIRI) and functional magnetic resonance imaging (fMRI) data, with improved handling and direct localization in the MRI compared to available sensors. Quantitative phantom and interference measurements showed that both methods can be combined without reciprocal adverse effects. The direct localization of the optical sensor on MR images acquired with a T1-weighted echo sequence simplifies the co-registration of NIRI and MRI data. In addition, the optical sensor is simple to attach, which is crucial for measurements on vulnerable subjects. fNIRI and T2*-weighted fMRI data of a cerebral activation were acquired simultaneously, demonstrating the practicability of the setup.

Joseph P Romano, Azeem M Shaikh, Michael Wolf, Consonance and the closure method in multiple testing, The International Journal of Biostatistics, Vol. 7 (1), 2011. (Journal Article)
Consider the problem of testing s null hypotheses simultaneously. In order to deal with the multiplicity problem, the classical approach is to restrict attention to multiple testing procedures that control the familywise error rate (FWE). The closure method of Marcus et al. (1976) reduces the problem of constructing such procedures to one of constructing single tests that control the usual probability of a Type 1 error. It was shown by Sonnemann (1982, 2008) that any coherent multiple testing procedure can be constructed using the closure method. Moreover, it was shown by Sonnemann and Finner (1988) that any incoherent multiple testing procedure can be replaced by a coherent multiple testing procedure which is at least as good. In this paper, we first show an analogous result for dissonant and consonant multiple testing procedures. We show further that, in many cases, the improvement of the consonant multiple testing procedure over the dissonant multiple testing procedure may in fact be strict in the sense that it has strictly greater probability of detecting a false null hypothesis while still maintaining control of the FWE. Finally, we show how consonance can be used in the construction of some optimal maximin multiple testing procedures. This last result is especially of interest because there are very few results on optimality in the multiple testing literature.

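The closure method described in the abstract above can be sketched concretely. The Python sketch below is an illustrative implementation, not the paper's code; it uses Bonferroni tests for the intersection hypotheses (one simple, valid choice), in which case closure reproduces Holm's step-down procedure. All function names are hypothetical.

```python
from itertools import combinations

def bonferroni_intersection(pvals, S):
    # Bonferroni test of the intersection hypothesis H_S: a valid level-alpha
    # test rejects when min_{j in S} p_j <= alpha / |S|.
    return min(1.0, len(S) * min(pvals[j] for j in S))

def closure_reject(pvals, alpha=0.05):
    # Closure method (Marcus et al., 1976): reject the individual hypothesis
    # H_i at FWE level alpha iff every intersection hypothesis H_S with i in S
    # is rejected by its own level-alpha test.
    s = len(pvals)
    rejected = []
    for i in range(s):
        # all subsets of {0, ..., s-1} that contain i
        subsets = {frozenset(c) | {i}
                   for r in range(s)
                   for c in combinations(range(s), r)}
        if all(bonferroni_intersection(pvals, S) <= alpha for S in subsets):
            rejected.append(i)
    return rejected

# With Bonferroni intersection tests, closure reproduces Holm's procedure:
print(closure_reject([0.001, 0.02, 0.30]))  # [0, 1]
```

Note that the enumeration of all intersection hypotheses containing i makes this exponential in s; practical procedures exploit shortcuts, and consonance is one source of such simplifications.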
Olivier Ledoit, Michael Wolf, Robust Performance Hypothesis Testing with the Variance, In: Working paper series / Institute for Empirical Research in Economics, No. No. 516, 2010. (Working Paper)
Applied researchers often test for the difference of the variances of two investment strategies; in particular, when the investment strategies under consideration aim to implement the global minimum variance portfolio. A popular tool to this end is the F-test for the equality of variances. Unfortunately, this test is not valid when the returns are correlated, have tails heavier than the normal distribution, or are of time series nature. Instead, we propose the use of robust inference methods. In particular, we suggest constructing a studentized time series bootstrap confidence interval for the ratio of the two variances and declaring the two variances different if the value one is not contained in the obtained interval. This approach has the advantage that one can simply resample from the observed data as opposed to some null-restricted data. A simulation study demonstrates the improved finite-sample performance compared to existing methods.

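As a rough sketch of the kind of interval proposed in the abstract above, here is a circular block bootstrap for the variance ratio in Python. It is a simplified percentile interval rather than the studentized interval the paper advocates (studentization improves finite-sample accuracy); the function name and default parameters are illustrative.

```python
import numpy as np

def var_ratio_ci(x, y, block_len=10, n_boot=2000, level=0.95, seed=0):
    # Circular block bootstrap CI for Var(x)/Var(y). Resampling in blocks
    # preserves the serial dependence that invalidates the classical F-test,
    # and resampling the two series jointly preserves their cross-correlation.
    rng = np.random.default_rng(seed)
    n = len(x)
    data = np.column_stack([x, y])
    ratios = []
    for _ in range(n_boot):
        # draw random block start points, wrap around the end of the sample
        starts = rng.integers(0, n, size=n // block_len + 1)
        idx = np.concatenate([(s + np.arange(block_len)) % n
                              for s in starts])[:n]
        ratios.append(data[idx, 0].var(ddof=1) / data[idx, 1].var(ddof=1))
    lo, hi = np.quantile(ratios, [(1 - level) / 2, (1 + level) / 2])
    return lo, hi  # declare the variances different if 1.0 lies outside [lo, hi]
```

The block length would in practice be chosen data-dependently; a fixed default is used here only to keep the sketch short.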
Joseph P Romano, Azeem M Shaikh, Michael Wolf, Hypothesis testing in econometrics, Annual Review of Economics, Vol. 2 (1), 2010. (Journal Article)
This article reviews important concepts and methods that are useful for hypothesis testing. First, we discuss the Neyman-Pearson framework. Various approaches to optimality are presented, including finite-sample and large-sample optimality. Then, we summarize some of the most important methods, as well as resampling methodology, which is useful to set critical values. Finally, we consider the problem of multiple testing, which has witnessed a burgeoning literature in recent years. Along the way, we incorporate some examples that are current in the econometrics literature. While many problems with well-known successful solutions are included, we also address open problems that are not easily handled with current technology, stemming from such issues as lack of optimality or poor asymptotic approximations.

Joseph P Romano, Michael Wolf, Balanced control of generalized error rates, Annals of Statistics, Vol. 38 (1), 2010. (Journal Article)
Consider the problem of testing s hypotheses simultaneously. In this paper, we derive methods which control the generalized family-wise error rate given by the probability of k or more false rejections, abbreviated k-FWER. We derive both single-step and step-down procedures that control the k-FWER in finite samples or asymptotically, depending on the situation. Moreover, the procedures are asymptotically balanced in an appropriate sense. We briefly consider control of the average number of false rejections. Additionally, we consider the false discovery proportion (FDP), defined as the number of false rejections divided by the total number of rejections (and defined to be 0 if there are no rejections). Here, the goal is to construct methods which satisfy, for given γ and α, P{FDP > γ} ≤ α, at least asymptotically. Special attention is paid to the construction of methods which implicitly take into account the dependence structure of the individual test statistics in order to further increase the ability to detect false null hypotheses. A general resampling and subsampling approach is presented which achieves these objectives, at least asymptotically.

Joseph P Romano, Azeem M Shaikh, Michael Wolf, Hypothesis Testing in Econometrics, In: Working paper series / Institute for Empirical Research in Economics, No. No. 444, 2009. (Working Paper)
This paper reviews important concepts and methods that are useful for hypothesis testing. First, we discuss the Neyman-Pearson framework. Various approaches to optimality are presented, including finite-sample and large-sample optimality. Then, some of the most important methods are summarized, as well as resampling methodology which is useful to set critical values. Finally, we consider the problem of multiple testing, which has witnessed a burgeoning literature in recent years. Along the way, we incorporate some examples that are current in the econometrics literature. While we include many problems with well-known successful solutions, we also include open problems that are not easily handled with current technology, stemming from issues like lack of optimality or poor asymptotic approximations.

Michael Wolf, Dan Wunderli, Fund-of-Funds Construction by Statistical Multiple Testing Methods, In: Working paper series / Institute for Empirical Research in Economics, No. No. 445, 2009. (Working Paper)
Fund-of-funds (FoF) managers face the task of selecting a (relatively) small number of hedge funds from a large universe of candidate funds. We analyse whether such a selection can be successfully achieved by looking at the track records of the available funds alone, using advanced statistical techniques. In particular, at a given point in time, we determine which funds significantly outperform a given benchmark while, crucially, accounting for the fact that a large number of funds are examined at the same time. This is achieved by employing so-called multiple testing methods. Then, the equal-weighted or the global minimum variance portfolio of the outperforming funds is held for one year, after which the selection process is repeated. When backtesting this strategy on two particular hedge fund universes, we find that the resulting FoF portfolios have attractive return properties compared to the 1/N portfolio (that is, simply equal-weighting all the available funds) but also when compared to two investable hedge fund indices.

Joseph P Romano, Azeem M Shaikh, Michael Wolf, Consonance and the Closure Method in Multiple Testing, In: Working paper series / Institute for Empirical Research in Economics, No. No. 446, 2009. (Working Paper)
Consider the problem of testing s hypotheses simultaneously. In order to deal with the multiplicity problem, the classical approach is to restrict attention to procedures that control the familywise error rate (FWE). Typically, it is known how to construct tests of the individual hypotheses, and the problem is how to combine them into a multiple testing procedure that controls the FWE. The closure method of Marcus et al. (1976), in fact, reduces the problem of constructing multiple test procedures which control the FWE to the construction of single tests which control the usual probability of a Type 1 error. The purpose of this paper is to examine the closure method with emphasis on the concepts of coherence and consonance. It was shown by Sonnemann and Finner (1988) that any incoherent procedure can be replaced by a coherent one which is at least as good. The main point of this paper is to show a similar result for dissonant and consonant procedures. We illustrate the idea of how a dissonant procedure can be strictly improved by a consonant procedure in the sense of increasing the probability of detecting a false null hypothesis while maintaining control of the FWE. We then show how consonance can be used in the construction of some optimal maximin procedures.

Richard M Bittman, Joseph P Romano, Carlos Vallarino, Michael Wolf, Optimal testing of multiple hypotheses with common effect direction, Biometrika, Vol. 96 (2), 2009. (Journal Article)
We present a theoretical basis for testing related endpoints. Typically, it is known how to construct tests of the individual hypotheses, and the problem is how to combine them into a multiple test procedure that controls the familywise error rate. Using the closure method, we emphasize the role of consonant procedures, from an interpretive as well as a theoretical viewpoint, and introduce a new procedure, which is consonant and has a maximin property under the normal model. The results are then applied to PROactive, a clinical trial designed to investigate the effectiveness of a glucose-lowering drug on macrovascular outcomes among patients with type 2 diabetes.

Joseph P Romano, Azeem M Shaikh, Michael Wolf, Control of the False Discovery Rate under Dependence using the Bootstrap and Subsampling, In: Working paper series / Institute for Empirical Research in Economics, No. No. 337, 2008. (Working Paper)
This paper considers the problem of testing s null hypotheses simultaneously while controlling the false discovery rate (FDR). Benjamini and Hochberg (1995) provide a method for controlling the FDR based on p-values for each of the null hypotheses under the assumption that the p-values are independent. Subsequent research has since shown that this procedure is valid under weaker assumptions on the joint distribution of the p-values. Related procedures that are valid under no assumptions on the joint distribution of the p-values have also been developed. None of these procedures, however, incorporate information about the dependence structure of the test statistics. This paper develops methods for control of the FDR under weak assumptions that incorporate such information and, by doing so, are better able to detect false null hypotheses. We illustrate this property via a simulation study and two empirical applications. In particular, the bootstrap method is competitive with methods that require independence if independence holds, but it outperforms these methods under dependence.

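For reference, the Benjamini and Hochberg (1995) step-up procedure that the abstract above takes as its starting point can be written in a few lines of Python. This is an illustrative baseline implementation only; the paper's own bootstrap and subsampling methods additionally exploit the dependence structure of the test statistics.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    # Step-up procedure of Benjamini and Hochberg (1995): with sorted p-values
    # p_(1) <= ... <= p_(s), find the largest i with p_(i) <= i*q/s and reject
    # the hypotheses with the i smallest p-values. Controls the FDR at level q
    # when the p-values are independent. Returns indices of rejected hypotheses.
    p = np.asarray(pvals)
    s = len(p)
    order = np.argsort(p)
    below = np.nonzero(p[order] <= q * np.arange(1, s + 1) / s)[0]
    if below.size == 0:
        return []
    return sorted(order[: below.max() + 1].tolist())

print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.20, 0.60]))  # [0, 1]
```

Note the step-up character: a p-value above its own threshold can still be rejected if a larger p-value falls below its threshold further down the sequence.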
Olivier Ledoit, Michael Wolf, Robust performance hypothesis testing with the Sharpe ratio, Journal of Empirical Finance, Vol. 15 (5), 2008. (Journal Article)
Applied researchers often test for the difference of the Sharpe ratios of two investment strategies. A very popular tool to this end is the test of Jobson and Korkie (1981), which has been corrected by Memmel (2003). Unfortunately, this test is not valid when returns have tails heavier than the normal distribution or are of time series nature. Instead, we propose the use of robust inference methods. In particular, we suggest constructing a studentized time series bootstrap confidence interval for the difference of the Sharpe ratios and declaring the two ratios different if zero is not contained in the obtained interval. This approach has the advantage that one can simply resample from the observed data as opposed to some null-restricted data. A simulation study demonstrates the improved finite-sample performance compared to existing methods. In addition, two applications to real data are provided.

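A simplified Python sketch of the bootstrap interval described in the abstract above: a circular block bootstrap percentile interval for the difference of the Sharpe ratios. The paper uses a studentized interval, which has better finite-sample properties; the function name and default parameters here are illustrative.

```python
import numpy as np

def sharpe_diff_ci(x, y, block_len=10, n_boot=2000, level=0.95, seed=0):
    # Circular block bootstrap CI for SR(x) - SR(y). Blocks preserve serial
    # dependence in the returns; resampling the two series jointly preserves
    # their cross-correlation. Declare the Sharpe ratios different if 0 lies
    # outside the returned interval.
    rng = np.random.default_rng(seed)
    n = len(x)
    data = np.column_stack([x, y])
    sr = lambda r: r.mean() / r.std(ddof=1)  # sample Sharpe ratio
    diffs = []
    for _ in range(n_boot):
        # random block start points, wrapping around the end of the sample
        starts = rng.integers(0, n, size=n // block_len + 1)
        idx = np.concatenate([(s + np.arange(block_len)) % n
                              for s in starts])[:n]
        diffs.append(sr(data[idx, 0]) - sr(data[idx, 1]))
    lo, hi = np.quantile(diffs, [(1 - level) / 2, (1 + level) / 2])
    return lo, hi
```

Because the bootstrap resamples the observed data directly (no null-restricted resampling is needed), the same machinery delivers both the confidence interval and the implied two-sided test.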
Joseph P Romano, Azeem M Shaikh, Michael Wolf, Control of the false discovery rate under dependence using the bootstrap and subsampling, Test, Vol. 17 (3), 2008. (Journal Article)
This paper considers the problem of testing s null hypotheses simultaneously while controlling the false discovery rate (FDR). Benjamini and Hochberg (1995) provide a method for controlling the FDR based on p-values for each of the null hypotheses under the assumption that the p-values are independent. Subsequent research has since shown that this procedure is valid under weaker assumptions on the joint distribution of the p-values. Related procedures that are valid under no assumptions on the joint distribution of the p-values have also been developed. None of these procedures, however, incorporate information about the dependence structure of the test statistics. This paper develops methods for control of the FDR under weak assumptions that incorporate such information and, by doing so, are better able to detect false null hypotheses. We illustrate this property via a simulation study and two empirical applications. In particular, the bootstrap method is competitive with methods that require independence if independence holds, but it outperforms these methods under dependence.

Richard M Bittman, Joseph P Romano, Carlos Vallarino, Michael Wolf, Optimal testing of multiple hypotheses with common effect direction, In: Working paper series / Institute for Empirical Research in Economics, No. No. 307, 2008. (Working Paper)
We present a theoretical basis for testing related endpoints. Typically, it is known how to construct tests of the individual hypotheses, and the problem is how to combine them into a multiple test procedure that controls the familywise error rate. Using the closure method, we emphasize the role of consonant procedures, from an interpretive as well as a theoretical viewpoint. Surprisingly, even if each intersection test has an optimality property, the overall procedure obtained by applying closure to these tests may be inadmissible. We introduce a new procedure, which is consonant and has a maximin property under the normal model. The results are then applied to PROactive, a clinical trial designed to investigate the effectiveness of a glucose-lowering drug on macrovascular outcomes among patients with type 2 diabetes.

Joseph P Romano, Michael Wolf, Balanced Control of Generalized Error Rates, In: Working paper series / Institute for Empirical Research in Economics, No. No. 379, 2008. (Working Paper)
Consider the problem of testing s hypotheses simultaneously. In this paper, we derive methods which control the generalized familywise error rate given by the probability of k or more false rejections, abbreviated k-FWER. We derive both single-step and step-down procedures that control the k-FWER in finite samples or asymptotically, depending on the situation. Moreover, the procedures are asymptotically balanced in an appropriate sense. We briefly consider control of the average number of false rejections. Additionally, we consider the false discovery proportion (FDP), defined as the number of false rejections divided by the total number of rejections (and defined to be 0 if there are no rejections). Here, the goal is to construct methods which satisfy, for given γ and α, P{FDP > γ} ≤ α, at least asymptotically. Special attention is paid to the construction of methods which implicitly take into account the dependence structure of the individual test statistics in order to further increase the ability to detect false null hypotheses. A general resampling and subsampling approach is presented which achieves these objectives, at least asymptotically.

Olivier Ledoit, Michael Wolf, Robust Performance Hypothesis Testing with the Sharpe Ratio, In: Working paper series / Institute for Empirical Research in Economics, No. No. 320, 2008. (Working Paper)
Applied researchers often test for the difference of the Sharpe ratios of two investment strategies. A very popular tool to this end is the test of Jobson and Korkie (1981), which has been corrected by Memmel (2003). Unfortunately, this test is not valid when returns have tails heavier than the normal distribution or are of time series nature. Instead, we propose the use of robust inference methods. In particular, we suggest constructing a studentized time series bootstrap confidence interval for the difference of the Sharpe ratios and declaring the two ratios different if zero is not contained in the obtained interval. This approach has the advantage that one can simply resample from the observed data as opposed to some null-restricted data. A simulation study demonstrates the improved finite-sample performance compared to existing methods. In addition, two applications to real data are provided.

Joseph P Romano, Azeem M Shaikh, Michael Wolf, Formalized data snooping based on generalized error rates, Econometric Theory, Vol. 24 (2), 2008. (Journal Article)
It is common in econometric applications that several hypothesis tests are carried out simultaneously. The problem then becomes how to decide which hypotheses to reject, accounting for the multitude of tests. The classical approach is to control the familywise error rate (FWE) which is the probability of one or more false rejections. But when the number of hypotheses under consideration is large, control of the FWE can become too demanding. As a result, the number of false hypotheses rejected may be small or even zero. This suggests replacing control of the FWE by a more liberal measure. To this end, we review a number of recent proposals from the statistical literature. We briefly discuss how these procedures apply to the general problem of model selection. A simulation study and two empirical applications illustrate the methods.

David Afshartous, Michael Wolf, Avoiding "data snooping" in multilevel and mixed effects models, Journal of the Royal Statistical Society: Series A, Vol. 170 (4), 2007. (Journal Article)
Multilevel or mixed effects models are commonly applied to hierarchical data. The level 2 residuals, which are otherwise known as random effects, are often of both substantive and diagnostic interest. Substantively, they are frequently used for institutional comparisons or rankings. Diagnostically, they are used to assess the model assumptions at the group level. Inference on the level 2 residuals, however, typically does not account for "data snooping", i.e. for the harmful effects of carrying out a multitude of hypothesis tests at the same time. We provide a very general framework that encompasses both of the following inference problems: inference on the "absolute" level 2 residuals to determine which are significantly different from 0, and inference on any prespecified number of pairwise comparisons. Thus, the user has the choice of testing the comparisons of interest. As our methods are flexible with respect to the estimation method that is invoked, the user may choose the desired estimation method accordingly. We demonstrate the methods with the London education authority data, the wafer data and the National Educational Longitudinal Study data.

Michael Wolf, Joseph Romano, Control of generalized error rates in multiple testing, Annals of Statistics, 2007. (Journal Article)

Michael Wolf, David Afshartous, Avoiding 'Data Snooping' in Multilevel and Mixed Effects Models, Journal of the Royal Statistical Society. Series A: Statistics in Society, 2007. (Journal Article)

Stefan Boes, Three essays on the econometric analysis of discrete dependent variables, University of Zurich, Faculty of Economics, Business Administration and Information Technology, 2007. (Dissertation)
