Research and prediction of the startups’ success on the Kickstarter platform

Authors

  • Nataliia V. Kuznietsova Educational and Scientific Complex "Institute for Applied System Analysis" of the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", Kyiv, Ukraine https://orcid.org/0000-0002-1662-1974
  • Yaroslav V. Grushko Educational and Scientific Complex "Institute for Applied System Analysis" of the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", Kyiv, Ukraine https://orcid.org/0000-0002-8442-5668

DOI:

https://doi.org/10.20535/SRIT.2308-8893.2019.3.02

Keywords:

Forecasting, Extreme Gradient Boosting Method, K-nearest Neighbor Method, Survival Models, Startups, Project Success, Kickstarter Platform

Abstract

The main purpose of this study was to identify and predict the success of new startup projects. The task of predicting the success of a particular startup was solved using several data analysis methods, namely extreme gradient boosting and k-nearest neighbours. Both allowed project success to be predicted with high accuracy, with extreme gradient boosting proving the most effective. The use of survival models made it possible to estimate the average time needed for a startup to become successful and to identify the key industries in which startups become effective, predicting for each of them the time required to turn a progressive idea into a successful business. The most successful categories of startup projects were also identified, and the time required to achieve project success (survival) was predicted both overall and for specific project categories. For this purpose, survival models were constructed on the basis of the Cox proportional hazards and Kaplan-Meier models.
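
As an illustration of the modelling pipeline described above, the sketch below combines the two stages on the public Kaggle dump listed in the references (ks-projects-201801.csv): a classifier comparison (XGBoost vs. k-nearest neighbours) and a survival analysis (Kaplan-Meier and Cox proportional hazards, here via the lifelines package). The column names, feature choice, and the treatment of failed campaigns as censored observations are assumptions made for illustration, not the authors' exact setup.

# A minimal sketch, assuming the column layout of the Kaggle
# "Kickstarter projects" dataset (ks-projects-201801.csv):
# 'state', 'launched', 'deadline', 'usd_goal_real'.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier
from lifelines import KaplanMeierFitter, CoxPHFitter

df = pd.read_csv("ks-projects-201801.csv", parse_dates=["launched", "deadline"])
df = df[df["state"].isin(["successful", "failed"])].copy()
df["success"] = (df["state"] == "successful").astype(int)
df["duration_days"] = (df["deadline"] - df["launched"]).dt.days

# Stage 1: classification of project success (XGBoost vs. k-NN).
features = ["usd_goal_real", "duration_days"]          # illustrative feature set
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["success"], test_size=0.3, random_state=42)

xgb = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1,
                    eval_metric="logloss")
xgb.fit(X_train, y_train)
print("XGBoost accuracy:", accuracy_score(y_test, xgb.predict(X_test)))

knn = KNeighborsClassifier(n_neighbors=15)
knn.fit(X_train, y_train)
print("k-NN accuracy:   ", accuracy_score(y_test, knn.predict(X_test)))

# Stage 2: survival analysis. The "event" is project success; the observed
# time is the campaign duration, and failed campaigns are treated as
# censored observations (a modelling assumption for this sketch).
kmf = KaplanMeierFitter()
kmf.fit(durations=df["duration_days"], event_observed=df["success"])
print("Median time to success (days):", kmf.median_survival_time_)

cph = CoxPHFitter()
cph.fit(df[["duration_days", "success", "usd_goal_real"]],
        duration_col="duration_days", event_col="success")
cph.print_summary()

The same Kaplan-Meier fit can be repeated per project category (e.g. grouping by a 'main_category' column) to compare the time to success across industries, which is how the per-category estimates described in the abstract would be obtained.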

Author Biographies

Nataliia V. Kuznietsova, Educational and Scientific Complex "Institute for Applied System Analysis" of the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", Kyiv

Nataliia Kuznietsova,

Dr. Eng. Sc., an assistant professor at the Department of Mathematical Methods of System Analysis of the Educational and Scientific Complex "Institute for Applied System Analysis" of the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", Kyiv, Ukraine.

Scientific interests: risk analysis, data mining, survival models, Bayesian networks, information technologies, system analysis.

Yaroslav V. Grushko, Educational and Scientific Complex "Institute for Applied System Analysis" of the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", Kyiv

Yaroslav Grushko,

a student at the Educational and Scientific Complex "Institute for Applied System Analysis" of the National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", Kyiv, Ukraine.

References

Conventional Wisdom Says 90% of Startups Fail. Data Says Otherwise // Fortune. — Updated June 2017. — Available at: http://fortune.com/2017/06/27/startup-advice-data-failure/

Why startups fail, according to their founders // Fortune. — Updated September 2014. — Available at: http://fortune.com/2014/09/25/why-startups-fail-according-to-their-founders/

Altman N.S. An introduction to kernel and nearest-neighbor nonparametric regression / N.S. Altman // The American Statistician. — 1992. — Vol. 46, N 3. — P. 175–185.

Classifier comparison // Scikit-learn. — Updated 2018. — Available at: https://scikit-learn.org/stable/auto_examples/classification/plot_classifier_comparison.html

XGBoost (eXtreme Gradient Boosting) // Distributed (Deep) Machine Learning Community. — Updated 2016. — Available at: https://github.com/dmlc/xgboost.

Xgboost 0.82 // Python Package Index (PyPI). — Updated 2019. — Available at: https://pypi.org/project/xgboost/.

Friedman J.H. Greedy Function Approximation: A Gradient Boosting Machine / J.H. Friedman // Reitz Lecture. — 1999.

Hastie T. Boosting and Additive Trees (Chapter 10) / T. Hastie, R. Tibshirani, J.H. Friedman // The Elements of Statistical Learning. — 2nd ed. — 2009. — P. 337–384.

XGBoost (eXtreme Gradient Boosting) // Distributed (Deep) Machine Learning Community. — Updated 2016. — Available at: https://github.com/dmlc/xgboost/tree/master/demo#machine-learning-challenge-winning-solutions.

Kickstarter projects // Kaggle. — Updated 2018. — Available at: https://www.kaggle.com/kemical/kickstarter-projects/version/3#ks-projects-201801.csv

Kuznietsova N.V. Information Technologies for Clients’ Database Analysis and Behaviour Forecasting / N.V. Kuznietsova // Selected Papers of the XVII International Scientific and Practical Conference on Information Technologies and Security (ITS 2017). — 2017. — P. 56–62. — Available at: http://ceur-ws.org/Vol-2067.

Allison P.D. Survival Analysis Using SAS / P.D. Allison. — Cary, N.C.: SAS Institute, 2010. — 324 p.

Cox D.R. Regression Models and Life-Tables / D.R. Cox // Journal of the Royal Statistical Society, Series B. — 1972. — Vol. 34, N 2. — P. 187–220.

Kickstarter // PBC. — Updated 2019. — Available at: https://www.kickstarter.com/.

Sikorsky Challenge. — Updated 2019. — Available at: https://www.sikorskychallenge.com/.

Published

2019-10-07

Issue

Section

Progressive information technologies, high-efficiency computer systems