The Poisson-G Family of Distributions with Applications

We define and study a new class of continuous distributions called the Poisson-G family. We present three of its special models. Some of its mathematical properties, including explicit expressions for the ordinary and incomplete moments, quantile and generating functions and entropies, are provided. The model parameters are estimated by the maximum likelihood method. The flexibility of the new family is illustrated by means of two applications to real data sets.


Introduction
Recently, many generalized families of distributions have been proposed and extensively used in modeling data in various applied sciences such as economics, finance, insurance, engineering and life testing. However, there is still a clear need for extended forms of these distributions, obtained by adding one or more shape parameters, in order to achieve greater flexibility in modelling such data.
There are many well-known families in the literature, for example the exponentiated-G (Exp-G) family. In this paper, we propose and study the Poisson-G (Po-G) family of distributions. The main advantage of the Po-G family is that it gives practitioners a quite flexible class, with a single additional shape parameter, for fitting real-life data in applied fields. It may serve as a good alternative to other one-, two- or three-parameter families. It may also work better, in terms of model fitting, than other classes of distributions in certain practical situations, although this cannot always be guaranteed. Furthermore, a comprehensive account of some of its mathematical properties is provided. We show empirically that the special models of the Po-G family can provide better fits than other competitive models generated by well-known classes.
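Let $r(t)$ be the probability density function (pdf) of a random variable $T \in [a, b]$ for $-\infty < a < b < \infty$ and let $W[G(x)]$ be a function of the cumulative distribution function (cdf) of a random variable $X$ such that $W[G(x)]$ satisfies the following conditions:
(i) $W[G(x)] \in [a, b]$,
(ii) $W[G(x)]$ is differentiable and monotonically non-decreasing, and
(iii) $W[G(x)] \to a$ as $x \to -\infty$ and $W[G(x)] \to b$ as $x \to \infty$. (1)
Alzaatreh et al. (2013) proposed the T-X family with cdf
$$F(x) = \int_{a}^{W[G(x)]} r(t)\,dt, \qquad (2)$$
where $W[G(x)]$ satisfies the conditions (1). The corresponding pdf of (2) is
$$f(x) = \left\{\frac{d}{dx} W[G(x)]\right\} r\{W[G(x)]\}. \qquad (3)$$
Based on the generalized exponential power series distribution (Mahmoudi and Jafari, 2012), we first define the Po-G family of distributions through its cdf (4) and pdf (5), where $G(x; \xi)$ is the baseline cdf depending on a parameter vector $\xi$ and $\lambda > 0$ is a shape parameter.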
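The explicit forms of (4) and (5) are not reproduced in this text. As a minimal sketch, the code below assumes the Poisson case of the power series construction, $F(x) = (e^{\lambda G(x)} - 1)/(e^{\lambda} - 1)$ with pdf $f(x) = \lambda g(x) e^{\lambda G(x)}/(e^{\lambda} - 1)$. This form, the function names and the exponential baseline are assumptions made for illustration and may differ from the paper's exact definition.

```python
import math

def pog_cdf(G, lam):
    """Assumed Po-G cdf, written in terms of the baseline cdf value G = G(x)."""
    return math.expm1(lam * G) / math.expm1(lam)

def pog_pdf(G, g, lam):
    """Assumed Po-G pdf from the baseline cdf value G and baseline pdf value g."""
    return lam * g * math.exp(lam * G) / math.expm1(lam)

# Illustrative baseline only (not taken from the paper): unit-rate exponential.
def base_cdf(x):
    return -math.expm1(-x)

def base_pdf(x):
    return math.exp(-x)

lam = 2.0  # shape parameter lambda > 0
x = 0.7
print(pog_cdf(base_cdf(x), lam), pog_pdf(base_cdf(x), base_pdf(x), lam))
```

Writing both functions in terms of the baseline values $G(x)$ and $g(x)$ keeps the sketch independent of any particular baseline model.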
The rest of the paper is outlined as follows. In Section 2, we derive a very useful representation for the Po-G density function. Three special models of this family are presented in Section 3, together with some plots of their pdfs and hazard rate functions (hrfs). In Section 4, we obtain some general mathematical properties of the proposed family, including ordinary and incomplete moments, quantile function (qf), moment generating function (mgf) and entropies. Maximum likelihood estimation of the model parameters is investigated in Section 5. In Section 6, simulation results assessing the performance of the maximum likelihood estimation procedure are discussed for one special model. In Section 7, we present two applications to real data sets to illustrate the potential of three special models of the proposed family. Finally, some concluding remarks are presented in Section 8.

Linear representation
In this section, we provide a useful mixture representation for the Po-G family in terms of Exp-G densities. Using the power series for the exponential function, the pdf (5) of the Po-G family can be written as the linear combination (6). Equation (6) reveals that the density function of $X$ is a mixture of the baseline density and the Exp-G densities with power parameter $k + 1$, where $k$ is the summation index in (6). Thus, some structural properties of the Po-G family, such as ordinary and incomplete moments, mean deviations and generating function, can be determined from those of the Exp-G distributions.
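The mixture representation can be checked numerically. The sketch below assumes the Po-G pdf has the form $f(x) = \lambda g(x) e^{\lambda G(x)}/(e^{\lambda} - 1)$, which is an assumption and not a form stated in this text. Under it, expanding the exponential gives weights $\lambda^{k+1}/[(k+1)!\,(e^{\lambda} - 1)]$ on the Exp-G densities $(k+1)\,g(x)\,G(x)^{k}$. The function names are hypothetical.

```python
import math

def pog_pdf_closed(G, g, lam):
    """Assumed closed-form Po-G pdf at baseline values G = G(x), g = g(x)."""
    return lam * g * math.exp(lam * G) / math.expm1(lam)

def mixture_weight(k, lam):
    """Weight of the Exp-G density with power parameter k + 1 (assumed form)."""
    return lam ** (k + 1) / (math.factorial(k + 1) * math.expm1(lam))

def pog_pdf_series(G, g, lam, terms=60):
    """Truncated mixture: sum over k of weight_k * Exp-G density with power k + 1."""
    total = 0.0
    for k in range(terms):
        exp_g_density = (k + 1) * g * G ** k
        total += mixture_weight(k, lam) * exp_g_density
    return total

G, g, lam = 0.4, 1.3, 2.5  # arbitrary illustrative values
print(pog_pdf_closed(G, g, lam), pog_pdf_series(G, g, lam))
```

Under this assumed form the weights are positive and sum to one, so the expansion is a genuine mixture, and the $k = 0$ term is the baseline density itself.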

Special models
In this section, we provide three special models of the Po-G family. The pdf (5) will be most tractable when $G(x; \xi)$ and $g(x; \xi)$ have simple analytic expressions. These special models generalize some well-known distributions corresponding to the baseline Weibull (W), additive Weibull (AW) and Burr X (BX) distributions.

The PoW Model
Consider the cdf and pdf of the baseline W distribution (for $x > 0$).

The PoAW Model
The AW distribution has four non-negative parameters: two shape parameters, one smaller than 1 and the other larger than 1, and two scale parameters. Its pdf and cdf are defined for $x > 0$. Figure 2 displays plots of the PoAW density and its hrf for selected parameter values.

The PoBX Model
The BX (also known as the generalized Rayleigh) model has two positive parameters, and its cdf and pdf are defined for $x > 0$.
Plots of the PoBX density and its hrf for selected parameter values are displayed in Figure 3.
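As a rough illustration of how such density and hrf curves can be computed, the sketch below evaluates a PoBX-type pdf and hrf on a grid. It rests on two assumptions that are not confirmed by this text: the Po-G cdf is taken as $F(x) = (e^{\lambda G(x)} - 1)/(e^{\lambda} - 1)$, and the BX baseline is parameterized as $G(x) = [1 - e^{-(bx)^2}]^{a}$. The parameter values are arbitrary.

```python
import math

def bx_cdf(x, a, b):
    """Burr X (generalized Rayleigh) cdf; this parameterization is an assumption."""
    return (-math.expm1(-(b * x) ** 2)) ** a

def bx_pdf(x, a, b):
    """Derivative of bx_cdf with respect to x."""
    t = (b * x) ** 2
    return 2 * a * b * b * x * math.exp(-t) * (-math.expm1(-t)) ** (a - 1)

def pobx_pdf(x, lam, a, b):
    """Assumed Po-G pdf with the BX baseline."""
    return lam * bx_pdf(x, a, b) * math.exp(lam * bx_cdf(x, a, b)) / math.expm1(lam)

def pobx_sf(x, lam, a, b):
    """Survival function 1 - F(x) under the assumed Po-G cdf."""
    return (math.exp(lam) - math.exp(lam * bx_cdf(x, a, b))) / math.expm1(lam)

def pobx_hrf(x, lam, a, b):
    """Hazard rate function: pdf divided by survival function."""
    return pobx_pdf(x, lam, a, b) / pobx_sf(x, lam, a, b)

for x in [0.25 * i for i in range(1, 9)]:
    print(f"{x:.2f}  pdf={pobx_pdf(x, 2.0, 1.5, 0.8):.4f}  hrf={pobx_hrf(x, 2.0, 1.5, 0.8):.4f}")
```

Replacing `bx_cdf` and `bx_pdf` with another baseline gives the corresponding curves for the other special models.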

Properties
In this section, we derive some mathematical properties of the proposed family based on the linear representation derived in Section 2.

Ordinary and incomplete moments
The $r$th ordinary moment of $X$, say $\mu'_r$, follows from (6) as a linear combination of the $r$th moments of the Exp-G distributions.

Quantile and generating functions
The qf of $X$, where $X \sim$ Po-G$(\lambda, \xi)$, is obtained by inverting (4): $Q(u) = F^{-1}(u)$, $0 < u < 1$. Simulating the Po-G random variable is straightforward. If $U$ is a uniform variate on the unit interval (0,1), then the random variable $X = Q(U)$ follows (4). Equivalently, to simulate from the Po-G family with $U \sim U(0,1)$, one solves the nonlinear equation obtained by setting the cdf (4) equal to $U$. Now, we provide two formulae for the mgf $M(t) = E(e^{tX})$ of $X$. The first one can be derived from equation (6) as a linear combination of the functions $M_{k+1}(t)$, where $M_{k+1}(t)$ is the mgf of the Exp-G random variable $Y_{k+1}$ with power parameter $k + 1$. Hence, $M(t)$ can be determined from the Exp-G generating function. A second formula for $M(t)$ also follows from (6).
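The inversion method can be sketched as follows. The code assumes the Po-G cdf is $F(x) = (e^{\lambda G(x)} - 1)/(e^{\lambda} - 1)$, so that $Q(u) = Q_G\{\log[1 + u(e^{\lambda} - 1)]/\lambda\}$, where $Q_G$ is the baseline qf. Both this form and the Weibull parameterization $G(x) = 1 - e^{-(x/\text{scale})^{\text{shape}}}$ are assumptions for illustration, and the function names are hypothetical.

```python
import math
import random

def pog_quantile(u, lam, base_quantile):
    """Assumed Po-G quantile: map u to a baseline probability, then invert the baseline."""
    p = math.log1p(u * math.expm1(lam)) / lam
    p = min(p, 1.0 - 2.0 ** -53)  # guard against rounding up to exactly 1
    return base_quantile(p)

def weibull_quantile(p, shape, scale):
    """Quantile of the assumed Weibull baseline."""
    return scale * (-math.log1p(-p)) ** (1.0 / shape)

def weibull_cdf(x, shape, scale):
    """Cdf of the assumed Weibull baseline."""
    return -math.expm1(-((x / scale) ** shape))

def pow_sample(n, lam, shape, scale, rng):
    """Draw n PoW-type variates by inversion, X = Q(U) with U uniform on (0, 1)."""
    return [
        pog_quantile(rng.random(), lam, lambda p: weibull_quantile(p, shape, scale))
        for _ in range(n)
    ]

rng = random.Random(2024)  # fixed seed for reproducibility
print(pow_sample(5, lam=2.0, shape=1.5, scale=1.0, rng=rng))
```

When the baseline qf has no closed form, `base_quantile` can be replaced by a numerical root finder applied to the baseline cdf.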

Entropies
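The Rényi entropy of a random variable $X$ represents a measure of variation of the uncertainty. It is defined by
$$I_{\delta}(X) = \frac{1}{1 - \delta} \log \int_{-\infty}^{\infty} f(x)^{\delta}\,dx, \qquad \delta > 0,\ \delta \neq 1,$$
and can be evaluated for the Po-G family using the pdf (5). The $q$-entropy, say $H_q(X)$, can be obtained in a similar way, where $q > 0$, $q \neq 1$.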

Maximum likelihood estimation
Here, we determine the maximum likelihood estimates (MLEs) of the parameters of the new family of distributions from complete samples only. Let $x_1, \ldots, x_n$ be a random sample from the Po-G family with parameters $\lambda$ and $\xi$. Let $\Theta = (\lambda, \xi^{T})^{T}$ be the $(p \times 1)$ parameter vector. Then, the log-likelihood function for $\Theta$, say $\ell = \ell(\Theta)$, is given by Equation (8). Under standard regularity conditions, when $n \to \infty$, the distribution of $\hat{\Theta} - \Theta$ can be approximated by a multivariate normal $N_p(0, J(\hat{\Theta})^{-1})$ distribution, which can be used to obtain confidence intervals for the parameters. Here, $J(\hat{\Theta})$ is the total observed information matrix evaluated at $\hat{\Theta}$. The re-sampling bootstrap method can be used for correcting the biases of the MLEs of the model parameters. Good interval estimates may also be obtained using the bootstrap percentile method. The elements of $J(\Theta)$ are the negative second-order partial derivatives of $\ell$ with respect to the parameters.
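To make the estimation steps concrete, the sketch below covers a deliberately simplified case in which the baseline parameters are treated as known and only $\lambda$ is estimated. It assumes the Po-G pdf is $f(x) = \lambda g(x) e^{\lambda G(x)}/(e^{\lambda} - 1)$, which is an assumption and not a form given in this text. Under it, the score equation for $\lambda$ reduces to matching the sample mean of $G(x_i)$ to $1/(1 - e^{-\lambda}) - 1/\lambda$. All function names are hypothetical.

```python
import math

def pog_loglik(xs, lam, base_cdf, base_pdf):
    """Log-likelihood under the assumed Po-G pdf, for a given baseline."""
    n = len(xs)
    return (n * math.log(lam) - n * math.log(math.expm1(lam))
            + sum(math.log(base_pdf(x)) for x in xs)
            + lam * sum(base_cdf(x) for x in xs))

def mean_G(lam):
    """E[G(X)] under the assumed form; increases from 1/2 to 1 as lam grows."""
    return 1.0 / (-math.expm1(-lam)) - 1.0 / lam

def lambda_mle(xs, base_cdf, lo=1e-6, hi=50.0):
    """Solve the score equation for lam by bisection (baseline treated as known)."""
    target = sum(base_cdf(x) for x in xs) / len(xs)
    if target <= mean_G(lo):   # no positive root: estimate sits at the lower bound
        return lo
    if target >= mean_G(hi):   # estimate sits at the upper search bound
        return hi
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mean_G(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def wald_interval(lam_hat, n, z=1.96):
    """Normal-approximation interval from the observed information for lam."""
    info = n * (1.0 / lam_hat ** 2 - math.exp(lam_hat) / math.expm1(lam_hat) ** 2)
    half = z / math.sqrt(info)
    return lam_hat - half, lam_hat + half
```

In the full problem all components of $\Theta$ are estimated jointly, typically with a general-purpose numerical optimizer applied to the negative log-likelihood; the one-parameter case is used here only because it keeps the score equation and the observed information explicit.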

Simulation
In this section, for different combinations of the PoW parameters, samples of sizes $n$ = 50, 100, 200, 500 and 1000 are generated from the PoW distribution. From 1000 repetitions, we calculate the mean and the root mean square error (RMSE) of the estimate of each parameter. Table 1 reports the simulation results for two different combinations of the PoW parameters. It can be clearly observed that the RMSE decreases as the sample size increases, which supports the consistency of the estimators.
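The structure of such a Monte Carlo study can be sketched in a simplified setting. The code below is not the PoW experiment reported in Table 1: it assumes the Po-G cdf is $F(x) = (e^{\lambda G(x)} - 1)/(e^{\lambda} - 1)$ and treats the baseline as known, so that the MLE of $\lambda$ depends on the data only through the values $G(x_i)$. These choices, the default settings and the function names are assumptions for illustration.

```python
import math
import random

def mean_G(lam):
    """E[G(X)] under the assumed Po-G form."""
    return 1.0 / (-math.expm1(-lam)) - 1.0 / lam

def lambda_mle_from_G(g_values, lo=1e-6, hi=50.0):
    """Bisection solution of the score equation for lam, given the values G(x_i)."""
    target = sum(g_values) / len(g_values)
    if target <= mean_G(lo):
        return lo
    if target >= mean_G(hi):
        return hi
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mean_G(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def draw_G(lam, rng):
    """Draw G(X) for X from the assumed Po-G law by inversion."""
    return math.log1p(rng.random() * math.expm1(lam)) / lam

def monte_carlo(lam, sizes, reps, seed=0):
    """Return {n: (mean of estimates, RMSE)} over `reps` simulated samples per size."""
    rng = random.Random(seed)
    results = {}
    for n in sizes:
        estimates = [
            lambda_mle_from_G([draw_G(lam, rng) for _ in range(n)])
            for _ in range(reps)
        ]
        mean = sum(estimates) / reps
        rmse = math.sqrt(sum((e - lam) ** 2 for e in estimates) / reps)
        results[n] = (mean, rmse)
    return results
```

A call such as `monte_carlo(2.0, sizes=(50, 100, 200, 500, 1000), reps=1000)` mirrors the sample sizes and number of repetitions described above.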

Real data analysis
In this section, we provide two applications to real data to illustrate the flexibility of the PoW, PoAW and PoBX models presented in Section 3. The goodness-of-fit statistics for these models are compared with those of other competitive models, and the MLEs of the model parameters are determined. In order to compare the fitted models, we consider some goodness-of-fit measures including the Akaike information criterion (AIC), consistent Akaike information criterion (CAIC), Hannan-Quinn information criterion (HQIC), Bayesian information criterion (BIC) and $-2\ell$, where $\ell$ is the maximized log-likelihood. Further, we adopt the Anderson-Darling ($A^*$) and Cramér-von Mises ($W^*$) statistics in order to compare the fits of the new models with other nested and non-nested models. The smaller these statistics are, the better the fit.
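The formulas for the information criteria are not given in this text. The sketch below uses the versions commonly adopted in this literature, with $p$ the number of parameters and $n$ the sample size: AIC $= -2\ell + 2p$, CAIC $= -2\ell + 2pn/(n - p - 1)$, BIC $= -2\ell + p\log n$ and HQIC $= -2\ell + 2p\log(\log n)$. These definitions, and the function name, are assumptions and may differ from those used for Tables 2 and 4.

```python
import math

def information_criteria(loglik, n_params, n_obs):
    """Information criteria from a maximized log-likelihood (assumed definitions)."""
    m2l = -2.0 * loglik
    p, n = n_params, n_obs
    return {
        "-2l": m2l,
        "AIC": m2l + 2 * p,
        "CAIC": m2l + 2 * p * n / (n - p - 1),
        "BIC": m2l + p * math.log(n),
        "HQIC": m2l + 2 * p * math.log(math.log(n)),
    }

# Hypothetical values for illustration only: log-likelihood -100, 3 parameters, 74 observations.
for name, value in information_criteria(-100.0, 3, 74).items():
    print(f"{name}: {value:.3f}")
```

All four criteria add a penalty that grows with $p$ to $-2\ell$, so a model with more parameters is preferred only if it improves the log-likelihood enough to offset that penalty.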

The nicotine data
The first data set refers to nicotine measurements, taken from several brands of cigarettes in 1998 and collected by the Federal Trade Commission, an independent agency of the US government. The free-form data set can be found at http://pw1.netcom.com/rdavis2/smoke.html. The site http://home.att.net/rdavis2/cigra.html contains n = 346 observations. We compare the fit of the PoW and PoAW distributions with those of other competitive models, namely the additive Weibull (AW), beta Weibull (BW), transmuted Weibull Lomax (TWL), transmuted exponentiated generalized Weibull (TExGW), Kumaraswamy Weibull (Kw-W), new modified Weibull (NMW) and W distributions. The pdfs of these models are given in Appendix A.

The gauge lengths data
The second data set (gauge lengths of 20 mm) (Kundu and Raqab, 2009) consists of 74 observations. For these data, we compare the fit of the PoW and PoBX distributions with those of the generalized transmuted Burr X (GTBX), exponentiated transmuted generalized Rayleigh (ETGR), TGR, GR and Rayleigh (R) models. Tables 2 and 4 list the values of $-2\ell$, AIC, CAIC, HQIC, BIC, $A^*$ and $W^*$, whereas the MLEs and their corresponding standard errors (in parentheses) of the model parameters are given in Tables 3 and 5.
In Table 2, we compare the fits of the PoW and PoAW models with the AW, BW, TWL, Kw-W, NMW, ETGR and Mc-W distributions. We note that the PoW and PoAW distributions have the lowest values of the $-2\ell$, AIC, CAIC, HQIC, BIC, $A^*$ and $W^*$ statistics (for the nicotine data) among all fitted models. So, the PoW and PoAW models could be chosen as the best models. In Table 4, we compare the fits of the PoW and PoBX models with the GTBX, ETGR, TGR, GR and R models. The figures in this table reveal that the PoW and PoBX models have the lowest values of the $-2\ell$, AIC, CAIC, HQIC, BIC, $A^*$ and $W^*$ statistics (for the gauge lengths data) among all fitted models. So, the PoW and PoBX distributions can be chosen as the best models. These results indicate that the new distributions can be better models than other competitive lifetime models for these data. The histograms and the estimated densities for the nicotine and gauge lengths data are displayed in Figure 4. QQ-plots for the best fitted distributions are shown in Figure 5. These graphs suggest that the PoW, PoAW and PoBX distributions provide the best description of the two data sets.

Conclusions
The idea of generating new extended models from classic ones has been of great interest among researchers in the past decade. We present a new Poisson-G (Po-G) family of distributions. We provide some mathematical properties of the new family, including explicit expansions for the ordinary and incomplete moments, quantile and generating functions and entropies. The maximum likelihood estimation of the model parameters is investigated and the observed information matrix is determined. By means of two real data sets, we show that special cases of the Po-G family can provide better fits than other models generated from well-known families.