cos is my alias
-
2009-01-03
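Excerpt from a published article
Copyright notice: when reposting, please indicate the original source of the article, the author information, and this notice by means of a hyperlink.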
http://cosismine.blogbus.com/logs/33340067.html
The citation data for this article is as follows:
YX Liu and Ronald Rousseau. Definitions of time series in citation analysis with special attention to the h-index. Journal of Informetrics, 2(3), 2008, 202-210
Definitions of time series in citation analysis with special attention to the h-index
Yuxian Liu and Ronald Rousseau
Abstract
The structure of time series in citation analysis is revealed, using an adapted form of the Frandsen-Rousseau notation. Special cases where this approach can be used include time series of impact factors and time series of h-indices, or h-type indices. This leads to a tool to study the dynamic aspects of citation analysis. Time series of h-indices are calculated in some specific models.
Keywords: time series, citation analysis, h-index, impact factors
1. Introduction
We began this investigation when we realized that the evolution of the h-index, of h-type indicators and, more generally, of citation indicators is a topic not yet fully addressed. Hirsch (2005) claimed that the lifetime achievement h-index of a scientist grows linearly in time and provided some evidence. By and large this evidence was corroborated by Kelly and Jennions (2006). Yet one needs more and more citations to attain a one-point higher h-index: the minimum number of citations needed for an h-index equal to h is h², hence, in terms of this minimum, an increase by one point, from h to h+1, requires 2h+1 additional citations. It therefore seems intuitively clear that the growth of a scientist’s h-index should follow a concavely increasing curve, as predicted by the Egghe-Rousseau power model (2006). Is Hirsch nevertheless correct and if so, why? This problem will not be addressed in this article but, as a first step, we intend to provide precise definitions of time series for h-indices. Such definitions are necessary to avoid possible confusion. We provide a general scheme and notation for indicating exactly which time series is studied.
Time series are used to better understand the underlying mechanism that produces them. They can also be used in forecasting. This aspect is interesting in the framework of research evaluation: how will a scientist or research group most likely perform in the future? Of course, the first question is whether this type of time series is capable of predicting features that lie in the future. This question has been studied recently by Hirsch (2007), who found that the h-index series (our type 5, see further) is indeed a good predictor of future scientific achievements.
When we started writing this contribution it soon became clear that time series for journal impact factors can be defined in the same way as for h-indices.
As a precise notation already exists for all types of impact factors (Frandsen & Rousseau, 2005), we adapt it to the topic studied in this contribution.
The article is organized as follows. The next section introduces the adaptation of the notation used by Frandsen and Rousseau (2005). Then time series of citation data, based on a publication-citation matrix (in short: p-c matrix), are defined and discussed in sections 3 and 4. These two sections contain the essential ideas of this article. Sections 5 and 6 focus on the h-index, presenting different time series of h-indices for two very simple models. We conclude in section 7.
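The arithmetic behind the h-index argument in the introduction can be checked with a small Python sketch. It is only an illustration: the function names `h_index` and `min_citations_for` and the sample citation counts are assumptions made here, not part of the quoted paper.

```python
def h_index(citations):
    """Return the largest h such that h items have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def min_citations_for(h):
    """Smallest total citation count compatible with an h-index of h
    (exactly h items, each cited h times)."""
    return h * h


# Hypothetical record: five articles with these citation counts.
record = [10, 8, 5, 4, 3]
print(h_index(record))                               # 4
print(min_citations_for(5) - min_citations_for(4))   # 9, i.e. 2*4 + 1
```

The second printed value is the gap between the minimal totals for h = 4 and h = 5, which is the 2h+1 step mentioned in the text.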
2. The adapted Frandsen-Rousseau notation for publication and citation indicator calculations
We assume that the focus is on one set of articles. This set can be a scientist’s research record or, as in most examples, a journal, but it can also be the set of all journals in one particular field, or even all journals in a database. For this journal or scientist we intend to calculate an impact factor or an h-index (or a similar indicator). It might seem somewhat odd (and in practice it is not recommended) to calculate a person’s impact factor but, as long as this person publishes at least one article a year, this is formally possible; if the scientist does not publish in a given year the corresponding impact is naturally zero.
Consider a p-c matrix (Ingwersen et al., 2001) consisting of N publication years, from year Y to year Y+N-1 (the columns), and M citation years, from year Y to year Y+M-1 (the rows). Hence the p-c matrix is an M×N matrix. In theory a p-c matrix can be infinitely long in the two dimensions, but we will stick to the realistic case of a matrix with a finite number of rows and columns, even when this makes some formulations somewhat complicated. In this article the words series and sequence will be used as synonyms. We only consider empirical data sets and do not consider probabilistic p-c models as studied in e.g. (Glänzel, 2004).
In (Frandsen & Rousseau, 2005) a framework for describing general impact factors was introduced, yet in its original form this framework cannot be used here and an adaptation is necessary (we thank Leo Egghe for pointing this out to us). Consequently, the following quadruple is used instead: (Yp, np, Yc,q, nc,q), where
Yp is the first year of the publication period;
np denotes the length of the publication period;
Yc,q is the first year (= oldest year) of the citation period for the qth publication year; q = 1, …, np, where q = 1 refers to the oldest year and q = np to the most recent year;
nc,q denotes the length of the citation period for the qth publication year.
This quadruple will be referred to as the F-R notation and refers to the elements necessary for the calculation of one h-index or one impact factor. If some publication data fall outside the limits of the p-c matrix the whole calculation is not performed. We assume here that time is considered in steps of one year, but the notation also applies to other time steps. We show how this notation can be used to describe series of impact factors and h-indices alike. Recall that citations are always drawn from a pool, such as the Web of Science, Scopus, a local database such as the Chinese CSCD, or subsets thereof. We will further assume that this pool is known and will not consider this aspect any further.
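As a rough illustration of how an F-R quadruple selects data from a p-c matrix, the following Python sketch lists the selected cells and sums their citations. The function names, the toy matrix and the choice to pass Yc,q and nc,q as functions of q are assumptions made for this example. Rejecting out-of-range citation years as well as publication years is also an assumption that extends the rule stated above. The sum is only a citation count (for instance the numerator of an impact factor), not a complete indicator.

```python
def cells_for_quadruple(Y, N, M, Yp, n_p, Yc, nc):
    """List the (row, column) cells of an M x N p-c matrix selected by
    the quadruple (Yp, n_p, Yc(q), nc(q)).

    Rows are citation years Y..Y+M-1, columns are publication years Y..Y+N-1.
    Yc and nc are functions of q, the 1-based position in the publication period.
    Returns None if any required cell lies outside the matrix (assumption:
    the check is applied to citation years too).
    """
    cells = []
    for q in range(1, n_p + 1):
        col = (Yp + q - 1) - Y
        if not 0 <= col < N:
            return None
        for cit_year in range(Yc(q), Yc(q) + nc(q)):
            row = cit_year - Y
            if not 0 <= row < M:
                return None
            cells.append((row, col))
    return cells


def total_citations(pc, Y, Yp, n_p, Yc, nc):
    """Sum the citations in the cells selected by the quadruple, or None."""
    M, N = len(pc), len(pc[0])
    cells = cells_for_quadruple(Y, N, M, Yp, n_p, Yc, nc)
    if cells is None:
        return None
    return sum(pc[r][c] for r, c in cells)


# Hypothetical 3 x 3 p-c matrix with Y = 2000:
# rows = citation years 2000..2002, columns = publication years 2000..2002.
pc = [
    [1, 0, 0],
    [4, 2, 0],
    [6, 5, 3],
]

# Quadruple (2000, 2, 2002, 1): citations received in 2002
# by the publications of 2000 and 2001.
print(total_citations(pc, 2000, 2000, 2, lambda q: 2002, lambda q: 1))  # 11

# A publication period running past the last column is not evaluated.
print(total_citations(pc, 2000, 2002, 2, lambda q: 2002, lambda q: 1))  # None
```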
3. Types of time series of citation indicators
We keep a publication set fixed and study series of citation indicators derived from this set. We now define general time series of indicators and characterize what they say about the set of publications. Time series are of the form (sk), where k is an index ranging from time (year) 1 to some end time. As we consider several time series they are numbered by a superscript between square brackets. Specifics for each case are shown in Table 1. For a general element of the time series (the kth one) Table 1 gives the following elements, in that order, necessary for the calculation of the index: the first year of the publication period (Yp), the length of the publication period (np), the first year of the citation period for the qth publication year (Yc,q) and the length of the citation period for the qth publication year (nc,q). Hence, for a given p-c matrix a time series is completely determined by a quadruple in the F-R notation. Note that the first, and in general the qth, publication year may differ according to k, the element of the time series considered. If q is not mentioned in Yc,q or nc,q this just means that this year or this period does not depend on q, and hence is the same for all q.
As we focus on the p-c matrix we do not include in this table the simple time series that uses only one publication year (or one publication) and one citation year for each element in the series, or the corresponding cumulative case (see e.g. Franses, 2003). For completeness’ sake we just mention that such time series are of the form sk = (Yp, 1, Yc+k-1, 1), k = 1, …, M-Yc+Yp for the case of one citation year; or sk = (Yp, 1, Yc, k), k = 1, …, M-Yc+Yp, for the cumulative case, where usually Yp = Yc.
Table 1. Characterizations of time series of citation indicators
Type | Range of index (k) | Data elements needed for the calculation of the k-th element of the sequence (F-R notation)
1 | 1 to N | (Y+k-1, 1, Y+k-1, M-k+1)
2 | 1 to N | (Y, k, Y+q-1, M-q+1)
3 | 1 to min(N-1, M-2) | (Y+k-1, 2, Y+k+1, 1)
4 | 1 to M | (Y, min(k, N), Y+k-1, 1)
5 | 1 to M | (Y, min(k, N), Y+q-1, k+1-q)
6 | 1 to M | (Y, min(N, M-k+1), Y+k+q-2, 1)
7 | 1 to M | (Y, N, Y+q-1, min(k, M-q+1))
8 | 1 to min(N, M-w+1); w > 0 being a given citation window | (Y+k-1, 1, Y+k-1, w)*
9 | 1 to N-w+1; w > 0 being a given publication window | (Y+k-1, w, Y+k+q-2, w-q+1)
10 | 1 to N | (Y+N-k, k, Y+N-k+q-1, k-q+1), where M = N
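To make the entries of Table 1 more concrete, the Python sketch below writes out the quadruple for a few of the types and expands it into explicit publication and citation years. It is a minimal illustration under stated assumptions: the names `fr_quadruple` and `expand` are hypothetical, only types 1, 2, 3, 4, 5 and 7 are covered, and the range of k given in the table is not checked.

```python
def fr_quadruple(series_type, k, Y, N, M):
    """Return (Yp, n_p, Yc, nc) for the k-th element of a Table 1 series.

    Yc and nc are returned as functions of q (1-based publication year index),
    so that q-independent and q-dependent cases look the same.
    """
    if series_type == 1:
        return (Y + k - 1, 1, lambda q: Y + k - 1, lambda q: M - k + 1)
    if series_type == 2:
        return (Y, k, lambda q: Y + q - 1, lambda q: M - q + 1)
    if series_type == 3:
        return (Y + k - 1, 2, lambda q: Y + k + 1, lambda q: 1)
    if series_type == 4:
        return (Y, min(k, N), lambda q: Y + k - 1, lambda q: 1)
    if series_type == 5:
        return (Y, min(k, N), lambda q: Y + q - 1, lambda q: k + 1 - q)
    if series_type == 7:
        return (Y, N, lambda q: Y + q - 1, lambda q: min(k, M - q + 1))
    raise ValueError("type not covered in this sketch")


def expand(quad):
    """Turn a quadruple into (publication year, first citation year,
    last citation year) triples, one per publication year."""
    Yp, n_p, Yc, nc = quad
    return [(Yp + q - 1, Yc(q), Yc(q) + nc(q) - 1) for q in range(1, n_p + 1)]


# Hypothetical setting: Y = 2000 with N = M = 5.
# Type 5, k = 3: publications of 2000..2002, each cited from its
# publication year up to 2002.
print(expand(fr_quadruple(5, 3, 2000, 5, 5)))
# [(2000, 2000, 2002), (2001, 2001, 2002), (2002, 2002, 2002)]

# Type 3, k = 1: publications of 2000 and 2001, cited in 2002 only.
print(expand(fr_quadruple(3, 1, 2000, 5, 5)))
# [(2000, 2002, 2002), (2001, 2002, 2002)]
```

Read this way, type 5 accumulates both publications and citations up to year k, while type 3 uses a two-year publication window cited in the following year.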