The Transmuted Geometric-Weibull distribution: Properties, Characterizations and Regression Models

We propose a new lifetime model called the transmuted geometric-Weibull distribution. We derive several of its structural properties, including ordinary and incomplete moments, quantile and generating functions, probability weighted moments, Rényi and q-entropies, and order statistics. The model parameters are estimated by the maximum likelihood method, and the performance of the estimators is assessed by means of a Monte Carlo simulation study. A new location-scale regression model is introduced based on the proposed distribution. The new distribution is applied to two real data sets to illustrate its flexibility. Empirical results indicate that the proposed distribution can be an alternative to other lifetime models available in the literature for modeling real data in many areas.


Introduction
The Weibull distribution is widely used in probability and statistics because of its versatility in modeling real-world data. Yet there are many cases where the classical Weibull distribution is unable to capture the true phenomenon under study. Therefore, several of its generalizations have been proposed and studied. A generalized form of the Weibull distribution is obtained by introducing one or more parameters into the two-parameter Weibull distribution. Several of these generalized distributions have been shown to be more flexible and capable of modeling real-world data better than the classical Weibull distribution. A state-of-the-art survey of the class of such generalized Weibull distributions can be found in Lai et al. (2001) and Nadarajah (2009). Generalizations of the Weibull distribution studied in the literature include, but are not limited to, the exponentiated Weibull (Mudholkar and Srivastava, 1993; Mudholkar et al., 1995; Mudholkar, Srivastava et al., 1996), additive Weibull (Xie and Lai, 1995), Marshall-Olkin extended Weibull (Ghitany et al., 2005), beta Weibull (Famoye et al., 2005) and modified Weibull (Sarhan and Zaindin, 2009) distributions.
where [ ( )] satisfies conditions (1). The pdf corresponding to (2) is given by
According to Afify et al. (2016), the cdf of the transmuted geometric-G (TG-G) family is given by
where ( ; ϕ) and > 0, | | ≤ 1 are two additional shape parameters. The TG-G family is a wider class of continuous distributions: it includes the transmuted-G and geometric-G families of distributions. Consider the cdf of the Weibull (W) distribution,
The pdf corresponding to (4) is given by
The rest of the paper is organized as follows. In Section 2, some mathematical properties of the TGW distribution are obtained, such as the mixture representation, quantile function, moments, order statistics and reliability estimation. Section 3 is devoted to characterizations of the proposed distribution, and Section 4 presents the estimation of the model parameters by the maximum likelihood method. In Section 5, a brief Monte Carlo simulation study is performed to assess the maximum likelihood estimators (MLEs) of the model parameters.
The log-transmuted geometric-Weibull (LTGW) regression model is defined in Section 6. Section 7 is devoted to applications that illustrate the flexibility of the proposed distribution in fields such as univariate data fitting and survival analysis. Finally, some concluding remarks are given in Section 8.

Mathematical Properties
In this section, some mathematical properties of the TGW distribution are discussed.

Survival and Hazard Functions
The quotient of the pdf and the survival function plays a central role in reliability theory. We obtain the survival function corresponding to (4) as
In reliability studies, the hazard rate function (hrf) is an important characteristic and is fundamental to the design of safe systems in a wide variety of applications. Therefore, we discuss these properties of the TGW distribution. The hrf of X takes the form
In Figure 1, we display some plots of the pdf and hrf of the TGW distribution for selected parameter values. Figure 1 reveals that the TGW density generates various shapes, such as right-skewed, reversed-J and unimodal. Figure 1 also shows that the TGW distribution can produce hazard rate shapes such as increasing, decreasing and reversed-J. This implies that the TGW distribution can be very useful for fitting data sets with various shapes. These equations show the effect of the parameters on the tails of the TGW distribution.
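As an illustration of the quotient of the pdf and the survival function, the following minimal sketch computes a hazard rate numerically. It uses the ordinary two-parameter Weibull distribution as a stand-in, since the TGW pdf and survival function are not reproduced in the text above; the function names and the scale/shape parameterization are assumptions made for this example.

```python
import math

def weibull_pdf(x, scale, shape):
    """Two-parameter Weibull density, used here only as a stand-in baseline."""
    z = (x / scale) ** shape
    return (shape / x) * z * math.exp(-z)

def weibull_sf(x, scale, shape):
    """Weibull survival function S(x) = exp(-(x/scale)^shape)."""
    return math.exp(-((x / scale) ** shape))

def hazard(pdf, sf, x):
    """Hazard rate h(x) = f(x) / S(x) for any pdf/survival pair."""
    return pdf(x) / sf(x)

def weibull_hazard(x, scale, shape):
    """Hazard of the stand-in Weibull model, built from the generic quotient."""
    return hazard(lambda t: weibull_pdf(t, scale, shape),
                  lambda t: weibull_sf(t, scale, shape), x)

# shape < 1: decreasing hazard; shape = 1: constant; shape > 1: increasing
for shape in (0.5, 1.0, 2.0):
    print(shape, [round(weibull_hazard(x, 1.0, shape), 4) for x in (0.5, 1.0, 2.0)])
```

The generic `hazard` function accepts any pdf and survival function, so the TGW versions could be substituted once they are coded.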

Mixture Representation
In this section, we provide a very useful representation for the TG-W density in terms of exponentiated-Weibull (exp-W) densities. The pdf in (5) can be rewritten as
Then, the pdf in (6) can be rewritten as
The pdf (7) can be expressed as a mixture of exp-W densities
The cdf of the TG-W distribution can also be expressed as a mixture of exp-W cdfs. By integrating (8), we obtain the same mixture representation
where Π ( ) = (1 − e −( ) ) is the cdf of the exp-W distribution with power parameter .
Simulating the TG-W random variable is straightforward. If U is a uniform variate on the unit interval (0,1), then the random variable X = Q(U), where Q denotes the quantile function (qf), follows the TG-W distribution.
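The inversion step can be sketched as follows. Because the closed form of the TG-W qf is not shown above, the ordinary Weibull qf is used as a stand-in; `weibull_qf`, `sample_by_inversion` and the scale/shape parameterization are assumptions of this example, not the authors' implementation.

```python
import math
import random

def weibull_cdf(x, scale, shape):
    """Stand-in baseline cdf F(x) = 1 - exp(-(x/scale)^shape)."""
    return 1.0 - math.exp(-((x / scale) ** shape))

def weibull_qf(u, scale, shape):
    """Inverse of weibull_cdf: Q(u) = scale * (-log(1 - u))^(1/shape)."""
    return scale * (-math.log(1.0 - u)) ** (1.0 / shape)

def sample_by_inversion(qf, n, rng):
    """Draw n variates as Q(U) with U uniform on (0, 1)."""
    return [qf(rng.random()) for _ in range(n)]

rng = random.Random(2024)
draws = sample_by_inversion(lambda u: weibull_qf(u, 2.0, 1.5), 20000, rng)

# Sanity check: the empirical cdf at x = 2 should be close to the model cdf.
empirical = sum(x <= 2.0 for x in draws) / len(draws)
print(round(empirical, 3), round(weibull_cdf(2.0, 2.0, 1.5), 3))
```

Any qf can be passed to `sample_by_inversion`, so the same routine applies to the TG-W distribution once its qf is available.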
The effects of the shape parameters on the skewness and kurtosis can be studied through quantile-based measures. We obtain skewness and kurtosis measures using the qf. Bowley's skewness measure is given by
$$B = \frac{Q(3/4) + Q(1/4) - 2\,Q(1/2)}{Q(3/4) - Q(1/4)}.$$
These measures have the advantage of being less sensitive to outliers. Moreover, they exist even for distributions without moments. Both measures equal zero for the normal distribution. Plots of the skewness and kurtosis of the TGW distribution are presented in Figure 2. These plots indicate that both measures depend strongly on the shape parameters. Therefore, the TGW distribution can model various data types in terms of skewness and kurtosis.
Figure 2. Plots of skewness and kurtosis of TGW distribution for = 2 and = 2.
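Quantile-based shape measures are easy to evaluate once a qf is available. The sketch below computes Bowley's skewness and, as an assumption, Moors' octile-based kurtosis (the kurtosis formula is not shown in the text above, so this choice is illustrative). The Weibull qf again serves as a stand-in for the TG-W qf.

```python
import math

def bowley_skewness(qf):
    """Bowley skewness: (Q(3/4) + Q(1/4) - 2 Q(1/2)) / (Q(3/4) - Q(1/4))."""
    q1, q2, q3 = qf(0.25), qf(0.5), qf(0.75)
    return (q3 + q1 - 2.0 * q2) / (q3 - q1)

def moors_kurtosis(qf):
    """Moors kurtosis from octiles (assumed here, not taken from the text):
    (Q(7/8) - Q(5/8) + Q(3/8) - Q(1/8)) / (Q(6/8) - Q(2/8))."""
    num = qf(7 / 8) - qf(5 / 8) + qf(3 / 8) - qf(1 / 8)
    return num / (qf(6 / 8) - qf(2 / 8))

def weibull_qf(u, scale, shape):
    """Stand-in quantile function Q(u) = scale * (-log(1 - u))^(1/shape)."""
    return scale * (-math.log(1.0 - u)) ** (1.0 / shape)

# Both measures vary with the shape parameter and are free of the scale.
for shape in (0.5, 1.0, 2.0, 4.0):
    qf = lambda u, k=shape: weibull_qf(u, 1.0, k)
    print(shape, round(bowley_skewness(qf), 4), round(moors_kurtosis(qf), 4))
```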

Moments and Generating Function
The rth moment of X, say μ′_r, follows from (9) as
Henceforth, denotes the exp-G distribution with power parameter . The nth central moment of X, say , is given by
The cumulants ( ) of X follow recursively from
and kurtosis $\gamma_2 = \kappa_4/\kappa_2^2$ are obtained from the third and fourth standardized cumulants. The nth descending factorial moment of X (for n = 1, 2, …) is
is the Stirling number of the first kind. The mgf $M(t) = E(e^{tX})$ of X can be derived from equation (8) as
where ( ) is the mgf of . Hence, M(t) can be determined from the exp-G generating function. Then

Incomplete Moments and Mean Deviations
The main applications of the first incomplete moment are the mean deviations and the Bonferroni and Lorenz curves. These curves are very useful in economics, reliability, demography, insurance and medicine. The th incomplete moment, say ( ), of X can be expressed from (8) as
(1 + , ( ) ) . (9)
The mean deviations about the mean and about the median are easily calculated from (4), and 1 ( ) is the first incomplete moment given by (9) with = 1.
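To make the role of the first incomplete moment concrete, the sketch below uses the standard identities δ1 = 2μF(μ) − 2m1(μ) for the mean deviation about the mean μ and δ2 = μ − 2m1(M) for the mean deviation about the median M, where m1(t) is the first incomplete moment. The integral is evaluated numerically, and the unit exponential distribution is used as a stand-in for which both answers are known; all function names are assumptions of this example.

```python
import math

def first_incomplete_moment(pdf, t, steps=20000):
    """m1(t) = integral from 0 to t of x f(x) dx, by the midpoint rule.
    Assumes the distribution is supported on x > 0."""
    h = t / steps
    return h * sum((k + 0.5) * h * pdf((k + 0.5) * h) for k in range(steps))

def mean_deviations(pdf, cdf, mean, median):
    """Return (deviation about the mean, deviation about the median)."""
    d_mean = 2.0 * mean * cdf(mean) - 2.0 * first_incomplete_moment(pdf, mean)
    d_median = mean - 2.0 * first_incomplete_moment(pdf, median)
    return d_mean, d_median

# Stand-in: unit exponential, with mean 1 and median log(2).
exp_pdf = lambda x: math.exp(-x)
exp_cdf = lambda x: 1.0 - math.exp(-x)
d1, d2 = mean_deviations(exp_pdf, exp_cdf, 1.0, math.log(2.0))
print(round(d1, 6), round(d2, 6))
```

For the unit exponential the two deviations are 2/e and log 2, which gives a quick check of the numerical integration.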

Order Statistics
Order statistics appear in many areas of statistical theory and practice. Let X_1, …, X_n be a random sample from the TG-W distribution. The pdf of the ith order statistic, say X_{i:n}, can be written as
Then
Using (5) and (11) we get
Substituting (12) in Equation (10), the pdf of X_{i:n} can be expressed as
and ( ) is the exp-W density with power parameter . Then, the density function of the TG-W order statistics is a mixture of exp-W densities. Based on the last equation, we note that the properties of X_{i:n} follow from those of + . For example, the moments of X_{i:n} can be expressed as
where
The L-moments are analogous to the ordinary moments but can be estimated by linear combinations of order statistics. They exist whenever the mean of the distribution exists, even though some higher moments may not exist, and are relatively robust to the effects of outliers. Based upon the moments in equation (13), we can derive explicit expressions for the L-moments of X as infinite weighted linear combinations of the means of suitable TG-W order statistics. They are linear functions of expected order statistics defined by
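As a numerical companion to the order-statistic results, the sketch below evaluates the textbook density of the ith order statistic, n!/((i−1)!(n−i)!) f(x) F(x)^(i−1) (1−F(x))^(n−i), for an arbitrary pdf/cdf pair and integrates it to obtain a mean. It does not use the mixture representation of the paper; the unit exponential is a stand-in, and the function names are assumptions of this example.

```python
import math

def order_stat_pdf(pdf, cdf, x, i, n):
    """Density of the i-th smallest of n i.i.d. draws:
    n! / ((i-1)! (n-i)!) * f(x) * F(x)^(i-1) * (1 - F(x))^(n-i)."""
    const = i * math.comb(n, i)  # equals n! / ((i-1)! (n-i)!)
    big_f = cdf(x)
    return const * pdf(x) * big_f ** (i - 1) * (1.0 - big_f) ** (n - i)

def order_stat_mean(pdf, cdf, i, n, upper, steps=20000):
    """E[X_{i:n}] by the midpoint rule on (0, upper). Assumes support on
    x > 0 and negligible probability mass beyond `upper`."""
    h = upper / steps
    return h * sum((k + 0.5) * h * order_stat_pdf(pdf, cdf, (k + 0.5) * h, i, n)
                   for k in range(steps))

# Stand-in: unit exponential. The minimum of n draws has mean 1/n.
exp_pdf = lambda x: math.exp(-x)
exp_cdf = lambda x: 1.0 - math.exp(-x)
print(round(order_stat_mean(exp_pdf, exp_cdf, 1, 4, upper=20.0), 5))
```

L-moments can then be assembled from such expected order statistics, since they are linear combinations of them.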

Probability Weighted Moments
Generally, the probability weighted moment (PWM) method can be used for estimating the parameters of a distribution whose inverse form cannot be expressed explicitly. The PWMs are expectations of certain functions of a random variable, and they can be defined for any random variable whose ordinary moments exist. They have low variance and no severe bias, and they compare favorably with estimators obtained by the maximum likelihood method. The PWM of X is defined by
From Equation (5)

Reliability estimation
The stress-strength model is the most widely used approach for reliability estimation. This model is used in many applications of physics and engineering, such as strength failure and system collapse. In stress-strength modeling, R = Pr(X_2 < X_1) is a measure of the reliability of the system when it is subjected to random stress X_2 and has strength X_1.
The system fails if and only if the applied stress is greater than its strength, and the component functions satisfactorily whenever X_1 > X_2. R can be considered a measure of system performance and arises naturally in electrical and electronic systems. In other words, the reliability R of the system is the probability that the system is strong enough to overcome the stress imposed on it. Let X_1 and X_2 be two independent random variables with TG-W( 1 , 1 , , ) and TG-W( 2 , 2 , , ) distributions. Then, the reliability is defined by
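The stress-strength probability Pr(stress < strength) can always be approximated by simulation, which is a convenient check on any closed-form expression. The sketch below is a Monte Carlo estimate under the assumption that strength and stress are independent ordinary Weibull variables with a common shape; in that stand-in case the exact value is s1^k / (s1^k + s2^k) for scales s1 (strength) and s2 (stress). The TG-W expression itself is not reproduced here, and all names are hypothetical.

```python
import random

def stress_strength_mc(draw_strength, draw_stress, n, rng):
    """Monte Carlo estimate of Pr(stress < strength) from n independent pairs."""
    hits = sum(draw_stress(rng) < draw_strength(rng) for _ in range(n))
    return hits / n

def weibull_common_shape_reliability(scale_strength, scale_stress, shape):
    """Exact Pr(stress < strength) for independent Weibull variables that
    share the same shape parameter (stand-in model only)."""
    a = scale_strength ** shape
    b = scale_stress ** shape
    return a / (a + b)

rng = random.Random(7)
r_mc = stress_strength_mc(lambda g: g.weibullvariate(2.0, 1.5),   # strength
                          lambda g: g.weibullvariate(1.0, 1.5),   # stress
                          50000, rng)
r_exact = weibull_common_shape_reliability(2.0, 1.0, 1.5)
print(round(r_mc, 3), round(r_exact, 3))
```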

Characterizations
Here, we provide characterizations of the TG-W distribution in terms of two truncated moments. This characterization result is based on a theorem (see Theorem 1 below) due to Glänzel (1987). The proof of Theorem 1 is given in Glänzel (1990). The result also holds when the interval is not closed. Moreover, as mentioned above, it can also be applied when the cdf does not have a closed form. Glänzel (1990) proved that this characterization is stable in the sense of weak convergence.
is defined with a real function ℎ. Assume that , ℎ ∈ 1 ( ), ∈ 2 ( ) and is a twice continuously differentiable and strictly monotone function on the set . Finally, assume that the equation ℎ = has no real solution in the interior of . Then is uniquely determined by the functions , ℎ and , particularly
where the function is a solution of the differential equation ′ = ′ ℎ/( ℎ − ) and is the normalization constant, such that = 1.
Let X be a random variable with density (5).
(15)
The general solution of the above differential equation is
where is a constant. A set of functions satisfying the differential equation (15) is given in Proposition 1 with = 0. Moreover, there are other triplets (ℎ, , ) satisfying the conditions of Theorem 1.

Maximum Likelihood Estimation
Several approaches for parameter estimation have been proposed in the literature, but the maximum likelihood method is the most commonly employed. The maximum likelihood estimators (MLEs) enjoy desirable properties and can be used for constructing confidence intervals and test statistics. The normal approximation for these estimators in large-sample theory is easily handled either analytically or numerically. Here, we determine the MLEs of the parameters of the TG-W distribution from complete samples only. Let x_1, …, x_n be a random sample from the TG-W distribution with parameters , , and . Let Θ = ( , , , )^T be the (4 × 1) parameter vector. Then, the log-likelihood function for Θ, say ℓ = ℓ(Θ), is given by
Under standard regularity conditions, when n → ∞, the distribution of Θ̂ − Θ can be approximated by a multivariate normal N_4(0, J(Θ̂)^{-1}) distribution, which is used to construct approximate confidence intervals for the parameters. Here, J(Θ̂) is the total observed information matrix evaluated at Θ̂. The bootstrap resampling method can be used for correcting the biases of the MLEs of the model parameters. Interval estimates may also be obtained using the bootstrap percentile method. Likelihood ratio tests can be performed for the proposed distribution in the usual way.
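The numerical maximization can be illustrated on the two-parameter Weibull baseline, for which the likelihood equations reduce to a one-dimensional root-finding problem in the shape parameter. This is a minimal sketch under that simplifying assumption; the four-parameter TG-W log-likelihood is not reproduced above and would require a general-purpose optimizer. The function names and the scale/shape parameterization are choices made for this example.

```python
import math
import random

def weibull_loglik(data, scale, shape):
    """Log-likelihood of the two-parameter Weibull model (stand-in)."""
    n = len(data)
    return (n * math.log(shape) - n * shape * math.log(scale)
            + (shape - 1.0) * sum(math.log(x) for x in data)
            - sum((x / scale) ** shape for x in data))

def weibull_mle(data):
    """MLE of (scale, shape) by bisection on the profile score for the shape.
    Assumes positive data that are not all identical."""
    logs = [math.log(x) for x in data]
    mean_log = sum(logs) / len(logs)
    top = max(data)  # rescale by the maximum to avoid overflow in x**k

    def score(k):
        w = [(x / top) ** k for x in data]
        return sum(wi * li for wi, li in zip(w, logs)) / sum(w) - 1.0 / k - mean_log

    lo, hi = 1e-3, 1.0
    while score(hi) < 0.0:      # expand until the root is bracketed
        hi *= 2.0
    for _ in range(100):        # bisection
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = top * (sum((x / top) ** shape for x in data) / len(data)) ** (1.0 / shape)
    return scale, shape

rng = random.Random(42)
sample = [rng.weibullvariate(2.0, 1.5) for _ in range(5000)]  # scale 2, shape 1.5
scale_hat, shape_hat = weibull_mle(sample)
print(round(scale_hat, 3), round(shape_hat, 3))
```

The maximized value `weibull_loglik(sample, scale_hat, shape_hat)` is the quantity that enters likelihood ratio tests and information criteria.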

Simulation Study
In this section, a brief simulation study is conducted to examine the performance of the MLEs of the TGW parameters. The inverse transform method is used to generate random observations from the TGW distribution. We generate 1000 samples of sizes n = 50, 100, 500 and 1000 from the TGW distribution. The estimates are evaluated through the biases and the mean squared errors (MSEs) of the MLEs of the model parameters. The empirical study was conducted with the software R, and the results are given in Table 1. The values in Table 1 indicate that the estimates are quite stable and, more importantly, approach the nominal values as n goes to infinity. It is observed from Table 1 that the biases and MSEs decrease as n increases. The simulation study shows that the maximum likelihood method is appropriate for estimating the parameters of the TGW distribution. In fact, the MSEs of the estimates tend to zero as n increases. This supports the view that the asymptotic normal distribution provides an adequate approximation to the finite-sample distribution of the MLEs. The normal approximation can be improved by applying bias adjustments to these estimators.
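The structure of such a study can be sketched in a few lines. The example below is not the authors' R code: to stay short and self-contained it uses the exponential distribution (a Weibull with shape fixed at 1), whose rate has the closed-form MLE n / Σx, as a stand-in for the TGW model. The sample sizes and the number of replications follow the text; the function names and the true rate are assumptions.

```python
import random

def simulate_bias_mse(sampler, estimator, true_value, n, reps, rng):
    """Monte Carlo bias and MSE of `estimator` over `reps` samples of size n."""
    estimates = [estimator([sampler(rng) for _ in range(n)]) for _ in range(reps)]
    bias = sum(estimates) / reps - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / reps
    return bias, mse

def exp_rate_mle(sample):
    """Closed-form MLE of the exponential rate: n / sum(x)."""
    return len(sample) / sum(sample)

rng = random.Random(123)
true_rate = 2.0  # hypothetical true parameter value
for n in (50, 100, 500, 1000):
    bias, mse = simulate_bias_mse(lambda g: g.expovariate(true_rate),
                                  exp_rate_mle, true_rate, n, 1000, rng)
    print(n, round(bias, 4), round(mse, 5))
```

Replacing `sampler` by a TGW inverse-transform sampler and `estimator` by a numerical MLE routine would give a study of the kind summarized in Table 1.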

Log-Transmuted Geometric-Weibull Regression Model
The corresponding survival function is
Parametric regression models to estimate univariate survival functions for censored data are widely used. A parametric model that provides a good fit to lifetime data tends to yield more precise estimates of the quantities of interest. Based on the LTGW density, we
where = exp( ), = ( − )/ and is the number of uncensored observations (failures) and is the number of censored observations. The MLE ̂ of the vector of unknown parameters can be evaluated by maximizing the log-likelihood (21). We use the statistical software R to determine the estimate .
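A censored log-likelihood of this kind adds the log-density for each observed failure and the log-survival function for each censored observation. The sketch below shows that structure for a location-scale model y = b0 + b1·x + σ·z with a single covariate. As an assumption, z follows the standard smallest-extreme-value (log-Weibull) law instead of the LTGW law, whose density is not reproduced above; the function and argument names are hypothetical.

```python
import math

def log_weibull_censored_loglik(y, delta, x, b0, b1, sigma):
    """Log-likelihood for y_i = b0 + b1*x_i + sigma*z_i under right censoring.
    z_i has density exp(z - e^z) and survival exp(-e^z) (stand-in for LTGW).
    delta_i = 1 marks an observed failure, delta_i = 0 a censored value."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    total = 0.0
    for yi, di, xi in zip(y, delta, x):
        z = (yi - b0 - b1 * xi) / sigma
        if di == 1:
            total += -math.log(sigma) + z - math.exp(z)  # log-density term
        else:
            total += -math.exp(z)                         # log-survival term
    return total

# Hypothetical log-lifetimes, censoring indicators and covariate values.
y = [1.2, 0.7, 2.1, 1.9, 0.3]
delta = [1, 1, 0, 1, 0]
x = [0.0, 0.0, 1.0, 1.0, 0.0]
print(round(log_weibull_censored_loglik(y, delta, x, 1.0, 0.5, 0.8), 4))
```

Maximizing this function over (b0, b1, σ) with a numerical optimizer gives the MLEs for the stand-in model.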
Under standard regularity conditions, the asymptotic distribution of (̂− ) is multivariate normal +3 (0, ( ) −1 ), where ( ) is the expected information matrix. The asymptotic covariance matrix ( ) −1 of ̂ can be approximated by the inverse of the

Applications
In this section, we provide applications to real data sets to illustrate the flexibility of the TGW distribution. The parameters are estimated by the maximum likelihood method, and the R statistical software is used for the computations. First, we describe the data sets and then determine the MLEs (and the corresponding standard errors) of the parameters. In order to compare the competing models with the proposed distribution, we apply goodness-of-fit statistics to verify which distribution better fits the real data. The Cramér-von Mises (W*) and Anderson-Darling (A*) statistics are described in detail in Chen and Balakrishnan (1995). The log-likelihood values and the Akaike Information Criterion (AIC) are also obtained for all models and used to select the best model. In general, the smaller the values of W*, A* and AIC, the better the fit to the data.
We compare the performance of the TGW distribution with the other well-known families given in Table 2. Figure 3 provides further information through a histogram of the data with the fitted pdfs of all distributions. We present the plots of the fitted density, cumulative and survival functions, together with the probability-probability (P-P) plot, for the TGW distribution in Figure 4. These plots reveal that the estimated density, cumulative and survival functions of the TGW distribution fit the data well.
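For reference, the information criteria are simple functions of the maximized log-likelihood: AIC = 2k − 2ℓ and BIC = k log(n) − 2ℓ, where k is the number of estimated parameters and n the sample size. The short sketch below evaluates both; the log-likelihood values are hypothetical and are not results from the paper.

```python
import math

def aic(loglik, n_params):
    """Akaike Information Criterion: 2k - 2*loglik."""
    return 2.0 * n_params - 2.0 * loglik

def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion: k*log(n) - 2*loglik."""
    return n_params * math.log(n_obs) - 2.0 * loglik

# Hypothetical maximized log-likelihoods, for illustration only.
candidates = {"model A (4 parameters)": (-100.0, 4),
              "model B (2 parameters)": (-103.5, 2)}
for name, (ll, k) in candidates.items():
    print(name, aic(ll, k), round(bic(ll, k, 35), 3))
```

A richer model is preferred only when its gain in log-likelihood outweighs the penalty for the extra parameters.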

Multiply censored relay data
The data set concerns a production relay and a proposed design change (n = 35). Engineering experience suggested that the lifetime has a Weibull distribution. Engineering sought to compare the production and proposed designs over the range of test currents. These data are also reported and analyzed in Cordeiro et al. (2017). The LTGW regression model is adopted to analyze this data set. The variables involved in the study are: - observed thousands of cycles; - censoring indicator (0 = censoring, 1 = lifetime observed); and 1 - production (16 amps, 26 amps, 28 amps). We consider the following regression model
= 1 + 2 + ,
where has the LTGW density (17), for = 1, … , 35. Table 4 lists the MLEs of the parameters of the LTGW regression model fitted to the current data, together with the log-likelihood, AIC and BIC statistics. Based on Table 4, it is clear that 1 is statistically significant at the 5% level, so there is a significant difference among the production levels for the thousands of cycles. The plots in Figure 5(a) provide the Kaplan-Meier (KM) estimate and the estimated survival functions of the LTGW regression model. In view of Figure 5(a), there is no significant difference between the survival functions for the 26 and 28 amps levels. The plots of the hrf in Figure 5(b), corresponding to the thousands of cycles variable under the LTGW regression model, indicate that the hrf is larger for the 16 amps level than for the 26 and 28 amps levels. Based on these plots, we conclude that the LTGW regression model provides a good fit to these data.
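The Kaplan-Meier estimate used as the empirical reference in such plots is a product over the distinct failure times of (1 − d/r), where d is the number of failures at that time and r the number of units still at risk. A minimal sketch follows; the times and censoring indicators are hypothetical and are not the relay data.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.
    times:  observed times (failure or censoring)
    events: 1 if the lifetime was observed, 0 if censored
    Returns a list of (time, S(time)) at each distinct failure time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all observations that share the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

# Hypothetical data: thousands of cycles with censoring indicators.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

Plotting this step function against the fitted survival curve of a parametric regression model gives a visual check of fit of the kind shown in Figure 5(a).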

Conclusions
We introduce a new lifetime distribution named the transmuted geometric-Weibull (TGW) distribution. Some of its mathematical properties are obtained. The maximum likelihood method is used to estimate the model parameters, and the performance of the maximum likelihood estimators is assessed in terms of biases and mean squared errors. Two applications of the proposed distribution illustrate empirically its flexibility in modeling real data sets. A log location-scale regression model based on the new distribution is introduced and discussed by means of a real data application. For the data sets considered, the proposed distribution provides better fits than the competing models.