Hyperparameters
Online, batch, and mini-batch training
- Finnoff, W. (1994): "Diffusion approximations for the constant learning rate backpropagation algorithm and resistance to local minima." Neural Computation, 6(2):285–295. DOI https://doi.org/10.1162/neco.1994.6.2.285
- Wilson, D. R. & Martinez, T. R. (2003): "The general inefficiency of batch training for gradient descent learning." Neural Networks, 16, 1429–1451. DOI https://doi.org/10.1016/S0893-6080(03)00138-2
- Nakama, T. (2009): "Theoretical analysis of batch and on-line training for gradient descent learning in neural networks." Neurocomputing, 73(1–3):151–159. DOI https://doi.org/10.1016/j.neucom.2009.05.017
- Xu, Z.-B., Zhang, R. & Jing, W.-F. (2009): "When does online BP training converge?" IEEE Transactions on Neural Networks, 20(10):1529–1539. DOI https://doi.org/10.1109/TNN.2009.2025946
- Zhang, R., Xu, Z.-B., Huang, G.-B. & Wang, D. (2012): "Global convergence of online BP training with dynamic learning rate." IEEE Transactions on Neural Networks and Learning Systems, 23(2):330–341. DOI https://doi.org/10.1109/TNNLS.2011.2178315
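The three regimes studied above differ only in how many examples feed each weight update: all of them (batch), one (online), or a small subset (mini-batch). A minimal NumPy sketch on a linear least-squares model as a stand-in for a network (function and parameter names are illustrative, not taken from any cited paper):

```python
import numpy as np

def train(X, y, w, lr=0.01, epochs=100, batch_size=None, rng=None):
    """Gradient descent on squared error for a linear model y ~ X @ w.

    batch_size=None -> batch training (one update per epoch)
    batch_size=1    -> online training (one update per example)
    batch_size=k    -> mini-batch training
    """
    rng = rng or np.random.default_rng(0)
    n = len(X)
    bs = n if batch_size is None else batch_size
    for _ in range(epochs):
        order = rng.permutation(n)                # reshuffle every epoch
        for start in range(0, n, bs):
            b = order[start:start + bs]
            err = X[b] @ w - y[b]                 # residuals on this batch
            w = w - lr * (X[b].T @ err) / len(b)  # mean-gradient step
    return w
```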
Cyclic training
- Heskes, T. & Wiegerinck, W. (1996): "A theoretical comparison of batch-mode, on-line, cyclic, and almost-cyclic learning." IEEE Transactions on Neural Networks, 7, 919–925. DOI https://doi.org/10.1109/72.508935
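The batch/online comparison above concerns how much data feeds each update; the cyclic/almost-cyclic distinction studied by Heskes & Wiegerinck concerns only the order in which examples are presented. A small illustrative sketch (the names are ours, not theirs):

```python
import numpy as np

def epoch_order(n, mode, rng=None):
    """Presentation order of n training examples for one epoch.

    'cyclic'        -> the same fixed order every epoch
    'almost_cyclic' -> every example exactly once, reshuffled each epoch
    'online'        -> i.i.d. sampling with replacement, for contrast
    """
    rng = rng or np.random.default_rng(0)
    if mode == "cyclic":
        return np.arange(n)
    if mode == "almost_cyclic":
        return rng.permutation(n)
    return rng.integers(0, n, size=n)
```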
Training algorithms for multilayer neural networks
Momentum
- Wang, J., Yang, J. & Wu, W. (2011): "Convergence of cyclic and almost-cyclic learning with momentum for feedforward neural networks." IEEE Transactions on Neural Networks, 22(8):1297–1306. DOI https://doi.org/10.1109/TNN.2011.2159992
- Mitliagkas, I., Zhang, C., Hadjis, S. & Ré, C. (2016): "Asynchrony begets Momentum, with an Application to Deep Learning." arXiv preprint, https://arxiv.org/abs/1605.09774 (cf. http://stanford.edu/~imit/tuneyourmomentum/theory/)
- Zhang, J. & Mitliagkas, I. (2017): "YellowFin and the Art of Momentum Tuning." arXiv preprint, https://arxiv.org/abs/1706.03471 (cf. http://dawn.cs.stanford.edu/2017/07/05/yellowfin/ & http://mitliagkas.github.io/async-tuner/)
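For reference alongside these papers, the classical (heavy-ball) momentum update they analyze and tune; a minimal sketch with illustrative names and constants:

```python
import numpy as np

def momentum_step(w, v, grad, lr=0.01, beta=0.9):
    """One classical (heavy-ball) momentum update.

    The velocity v is an exponentially decayed sum of past gradients, so
    steps accelerate along directions where gradients agree and cancel
    where they oscillate.
    """
    v = beta * v - lr * grad   # decay the history, add the new gradient
    w = w + v                  # the step follows the velocity
    return w, v
```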
A third term, as in PID control
- Zweiri, Y. H., Whidborne, J. F. & Seneviratne, L. D. (2003): "A three-term backpropagation algorithm." Neurocomputing, 50, 305–318. DOI https://doi.org/10.1016/S0925-2312(02)00569-6
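Reading the update through the PID analogy of the heading: the current gradient plays the proportional role and momentum the integral role, so the natural third term is derivative-like, reacting to the change in the gradient. A hedged sketch of such a generic PID-style update (not Zweiri et al.'s exact algorithm; gains and names are illustrative):

```python
import numpy as np

def pid_step(w, state, grad, kp=0.01, ki=0.005, kd=0.001):
    """Generic PID-flavoured weight update (illustrative only).

    P: current gradient; I: leaky accumulated gradient history
    (momentum-like); D: discrete change in the gradient.
    """
    integral = 0.9 * state.get("integral", np.zeros_like(w)) + grad
    derivative = grad - state.get("prev_grad", np.zeros_like(w))
    w = w - (kp * grad + ki * integral + kd * derivative)
    state.update(integral=integral, prev_grad=grad)
    return w, state
```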
"Emotional" backpropagation
- Khashman, A. (2008): "A modified backpropagation learning algorithm with added emotional coefficients." IEEE Transactions on Neural Networks, 19(11):1896–1909. DOI https://doi.org/10.1109/TNN.2008.2002913
Weight extrapolation
- Kamarthi, S. V. & Pittner, S. (1999): "Accelerating neural network training using weight extrapolations." Neural Networks, 12, 1285–1299. DOI https://doi.org/10.1016/S0893-6080(99)00072-6
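The idea in Kamarthi & Pittner is to pause ordinary training every few epochs and jump ahead along the trend of each weight's recent trajectory. A one-step linear version of that extrapolation (they fit richer extrapolation functions per weight; names here are illustrative):

```python
def extrapolate_weights(w_prev, w_curr, gamma=1.0):
    """Jump ahead along the recent direction of weight change.

    A single linear extrapolation step over the last two weight
    snapshots: w_new = w_curr + gamma * (w_curr - w_prev).
    Works elementwise on NumPy arrays as well as on scalars.
    """
    return w_curr + gamma * (w_curr - w_prev)
```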
Lyapunov theory
- Yu, X., Efe, M. O. & Kaynak, O. (2002): "A general backpropagation algorithm for feedforward neural networks learning." IEEE Transactions on Neural Networks, 13(1):251–254.
- Behera, L., Kumar, S. & Patnaik, A. (2006): "On adaptive learning rate that guarantees convergence in feedforward networks." IEEE Transactions on Neural Networks, 17(5):1116–1125.
- Man, Z., Wu, H. R., Liu, S. & Yu, X. (2006): "A new adaptive backpropagation algorithm based on Lyapunov stability theory for neural networks." IEEE Transactions on Neural Networks, 17(6):1580–1591.
Sliding mode control-based adaptive learning
- Sira-Ramirez, H., & Colina-Morles, E. (1995): "A sliding mode strategy for adaptive learning in Adalines." IEEE Transactions on Circuits and Systems I, 42(12), 1001–1012.
- Parma, G. G., Menezes, B. R. & Braga, A. P. (1998): "Sliding mode algorithm for training multilayer artificial neural networks." Electronics Letters, 34(1):97–98.
Successive approximations
- Liang, Y. C., Feng, D. P., Lee, H. P., Lim, S. P. & Lee, K. H. (2002): "Successive approximation training algorithm for feedforward neural networks." Neurocomputing, 42, 311–322.
Two-phase learning with gradient ascent
- Tang, Z., Wang, X., Tamura, H., & Ishii, M. (2003): "An algorithm of supervised learning for multilayer neural networks." Neural Computation, 15, 1125–1142.
Global descent: TRUST [terminal repeller unconstrained subenergy tunneling]
- Barhen, J., Protopopescu, V. & Reister, D. (1997): "TRUST: A deterministic algorithm for global optimization." Science, 276, 1094–1097.
- Cetin, B. C., Burdick, J. W. & Barhen, J. (1993): "Global descent replaces gradient descent to avoid local minima problem in learning with artificial neural networks." In Proceedings of IEEE International Conference on Neural Networks (pp. 836–842). San Francisco, CA.
- Chowdhury, P., Singh, Y. P. & Chansarkar, R. A. (1999): "Dynamic tunneling technique for efficient training of multilayer perceptrons." IEEE Transactions on Neural Networks, 10(1):48–55.
Terminal attractors (orders of magnitude faster)
- Zak, M. (1989): "Terminal attractors in neural networks." Neural Networks, 2, 259–274.
- Wang, S. D. & Hsu, C. H. (1991): "Terminal attractor learning algorithms for back propagation neural networks." In Proceedings of the International Joint Conference on Neural Networks (pp. 183–189). Seattle, WA.
- Jiang, M. & Yu, X. (2001): "Terminal attractor based back propagation learning for feedforward neural networks." In Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS) (Vol. 2, pp. 711–714). Sydney, Australia.
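The speed claim comes from the dynamics: an ordinary attractor such as dx/dt = -x only approaches its equilibrium asymptotically, while Zak's terminal attractors (e.g. dx/dt = -x^(1/3)) reach it in finite time because the Lipschitz condition fails at the equilibrium. A small numerical demonstration (parameters illustrative):

```python
import numpy as np

def settle_time(power, x0=1.0, dt=1e-3, tol=1e-3, t_max=50.0):
    """Integrate dx/dt = -sign(x) * |x|**power until |x| < tol.

    power=1   -> ordinary attractor: exponential, asymptotic approach
    power=1/3 -> terminal attractor: equilibrium reached in finite time
    """
    x, t = x0, 0.0
    while abs(x) > tol and t < t_max:
        x -= dt * np.sign(x) * abs(x) ** power
        t += dt
    return t

print(settle_time(1.0))      # ~6.9: exponential decay down to the tolerance
print(settle_time(1 / 3))    # ~1.5: reaches the equilibrium in finite time
```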
Robust algorithms (from a statistical standpoint)
- White, H. (1989): "Learning in artificial neural networks: A statistical perspective." Neural Computation, 1(4):425–469.
- Chen, D. S. & Jain, R. C. (1994): "A robust backpropagation learning algorithm for function approximation." IEEE Transactions on Neural Networks, 5(3):467–479.
- Chuang, C. C., Su, S. F. & Hsiao, C. C. (2000): "The annealing robust backpropagation (ARBP) learning algorithm." IEEE Transactions on Neural Networks, 11(5):1067–1077.
- Pernia-Espinoza, A. V., Ordieres-Mere, J. B., Martinez-de-Pison, F. J. & Gonzalez-Marcos, A. (2005): "TAO-robust backpropagation learning algorithm." Neural Networks, 18, 191–204.
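What unites these algorithms is replacing the squared-error criterion, whose gradient lets a single outlier dominate training, with a robust estimator whose influence is bounded. A minimal example using the Huber loss (a stand-in: the cited papers use related estimators such as annealed or TAO-type losses):

```python
import numpy as np

def huber_grad(residual, delta=1.0):
    """Derivative of the Huber loss with respect to the residual.

    Quadratic (like squared error) for small residuals, linear in the
    tails, so outliers contribute a bounded gradient.
    """
    return np.where(np.abs(residual) <= delta,
                    residual,                    # quadratic region
                    delta * np.sign(residual))   # clipped, linear region
```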
Without backward propagation of errors
- Brouwer, R. K. (1997): "Training a feed-forward network by feeding gradients forward rather than by back-propagation of errors." Neurocomputing, 16, 117–126. DOI https://doi.org/10.1016/S0925-2312(97)00020-9
With linear output neurons
- Manry, M. T., Apollo, S. J., Allen, L. S., Lyle, W. D., Gong, W., Dawson, M. S., et al. (1994): "Fast training of neural networks for remote sensing." Remote Sensing Reviews, 9, 77–96. DOI http://dx.doi.org/10.1080/02757259409532216
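With linear output units the training error is quadratic in the output-layer weights, so once the hidden activations are fixed those weights can be found in closed form rather than by gradient descent, which is the source of the speed-up exploited in this line of work. A minimal sketch (names and the ridge term are illustrative):

```python
import numpy as np

def fit_output_layer(H, T, ridge=1e-6):
    """Least-squares output weights for a net with linear output neurons.

    H: hidden activations (n_samples x n_hidden); T: targets.
    Solves (H^T H + ridge*I) W = H^T T; ridge only guards conditioning.
    """
    A = H.T @ H + ridge * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ T)
```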
Initialization of the network weights
- Kolen, J. F. & Pollack, J. B. (1990): "Backpropagation is sensitive to initial conditions." Complex Systems, 4(3):269–280.
- Lee, Y., Oh, S. H. & Kim, M. W. (1991): "The effect of initial weights on premature saturation in back-propagation training." In Proc. IEEE International Joint Conference on Neural Networks (Vol. 1, pp. 765–770). Seattle, WA.
- Nguyen, D. & Widrow, B. (1990): "Improving the learning speed of 2-layer neural networks by choosing initial values of the adaptive weights." In Proceedings of International Joint Conference on Neural Networks (Vol. 3, pp. 21–26). San Diego, CA.
- Wessels, L. F. A. & Barnard, E. (1992): "Avoiding false local minima by proper initialization of connections." IEEE Transactions on Neural Networks, 3(6):899–905.
- Drago, G. & Ridella, S. (1992): "Statistically controlled activation weight initialization (SCAWI)." IEEE Transactions on Neural Networks, 3(4):627–631.
- Thimm, G. & Fiesler, E. (1997): "High-order and multilayer perceptron initialization." IEEE Transactions on Neural Networks, 8(2):349–359.
- McLoone, S., Brown, M. D., Irwin, G. & Lightbody, G. (1998): "A hybrid linear/nonlinear training algorithm for feedforward neural networks." IEEE Transactions on Neural Networks, 9(4):669–684.
- Yam, J. Y. F. & Chow, T. W. S. (2001): "Feedforward networks training speed enhancement by optimal initialization of the synaptic coefficients." IEEE Transactions on Neural Networks, 12(2):430–434.
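As a concrete instance of the initialization schemes above, the Nguyen & Widrow (1990) recipe: draw random weights, then rescale them so the hidden units' active regions tile the input space roughly evenly. A sketch for one tanh hidden layer (variable names are ours):

```python
import numpy as np

def nguyen_widrow_init(n_in, n_hidden, rng=None):
    """Nguyen-Widrow initialization for a tanh hidden layer."""
    rng = rng or np.random.default_rng(0)
    w = rng.uniform(-1.0, 1.0, size=(n_hidden, n_in))
    beta = 0.7 * n_hidden ** (1.0 / n_in)                 # scale factor
    w *= beta / np.linalg.norm(w, axis=1, keepdims=True)  # row norms = beta
    b = rng.uniform(-beta, beta, size=n_hidden)           # spread the biases
    return w, b
```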
Parametric estimation & clustering techniques
- Denoeux, T., & Lengelle, R. (1993): "Initializing backpropagation networks with prototypes." Neural Networks, 6(3):351–363.
- Lehtokangas, M., Saarinen, J., Huuhtanen, P., & Kaski, K. (1995): "Initializing weights of a multilayer perceptron network by using the orthogonal least squares algorithm." Neural Computation, 7, 982–999.
- Smyth, S. G. (1992): "Designing multilayer perceptrons from nearest neighbor systems." IEEE Transactions on Neural Networks, 3(2):329–333.
- Weymaere, N. & Martens, J. P. (1994): "On the initializing and optimization of multilayer perceptrons." IEEE Transactions on Neural Networks, 5, 738–751.
- Yam, J. Y. F., & Chow, T.W. S. (2000): "A weight initialization method for improving training speed in feedforward neural network." Neurocomputing, 30, 219–232.
- Yam, Y. F., Chow, T. W. S. & Leung, C. T. (1997): "A new method in determining the initial weights of feedforward neural networks." Neurocomputing, 16, 23–32.
- Yam, Y. F., Leung, C. T., Tam, P. K. S. & Siu, W. C. (2002): "An independent component analysis based weight initialization method for multilayer perceptrons." Neurocomputing, 48, 807–818.
- Costa, P. & Larzabal, P. (1999): "Initialization of supervised training for parametric estimation." Neural Processing Letters, 9, 53–61.
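The common thread in this group is to seed the first layer from the data itself, e.g. with class prototypes or cluster centers, so hidden units start as meaningful feature detectors instead of random ones. A bare-bones sketch using k-means centers as initial weight vectors (a simplification of, e.g., Denoeux & Lengelle's prototype scheme):

```python
import numpy as np

def prototype_init(X, n_hidden, n_iter=20, rng=None):
    """Initialize one weight vector per hidden unit from k-means centers."""
    rng = rng or np.random.default_rng(0)
    centers = X[rng.choice(len(X), n_hidden, replace=False)].copy()
    for _ in range(n_iter):                       # a few Lloyd iterations
        d2 = ((X[:, None, :] - centers) ** 2).sum(-1)
        labels = d2.argmin(axis=1)                # nearest center per sample
        for k in range(n_hidden):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return centers
```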
Adjusting the network topology
Neural network pruning (multiple heuristics)
- Mozer, M. C. & Smolensky, P. (1989): "Using relevance to reduce network size automatically." Connection Science, 1(1):3–16.
- Le Cun, Y., Denker, J. S. & Solla, S. A. (1990): "Optimal brain damage." In D. S. Touretzky (Ed.), Advances in Neural Information Processing Systems 2 (pp. 598–605). San Mateo, CA: Morgan Kaufmann.
- Karnin, E. D. (1990): "A simple procedure for pruning back-propagation trained neural networks." IEEE Transactions on Neural Networks, 1(2):239–242.
- Sietsma, J. & Dow, R. J. F. (1991): "Creating artificial neural networks that generalize." Neural Networks, 4, 67–79.
- Hassibi, B., Stork, D. G. & Wolff, G. J. (1993): "Optimal brain surgeon and general network pruning." In Proceedings of IEEE International Conference on Neural Networks (pp. 293–299). San Francisco, CA.
- Goh, Y. S. & Tan, E. C. (1994): "Pruning neural networks during training by backpropagation." In Proceedings of IEEE Region 10's Ninth Annual International Conference (TENCON'94) (pp. 805–808). Singapore.
- Ponnapalli, P. V. S., Ho, K. C. & Thomson, M. (1999): "A formal selection and pruning algorithm for feedforward artificial neural network optimization." IEEE Transactions on Neural Networks, 10(4):964–968.
- Chandrasekaran, H., Chen, H. H. & Manry, M. T. (2000): "Pruning of basis functions in nonlinear approximators." Neurocomputing, 34, 29–53.
- Castellano, G., Fanelli, A. M. & Pelillo, M. (1997): "An iterative pruning algorithm for feedforward neural networks." IEEE Transactions on Neural Networks, 8(3):519–531.
- Kanjilal, P. P. & Banerjee, D. N. (1995): "On the application of orthogonal transformation for the design and analysis of feedforward networks." IEEE Transactions on Neural Networks, 6(5):1061–1070.
- Teoh, E. J., Tan, K. C. & Xiang, C. (2006): "Estimating the number of hidden neurons in a feedforward network using the singular value decomposition." IEEE Transactions on Neural Networks, 17(6):1623–1629.
- Zurada, J. M., Malinowski, A. & Usui, S. (1997): "Perturbation method for deleting redundant inputs of perceptron networks." Neurocomputing, 14, 177–193.
- Xing, H.-J. & Hu, B.-G. (2009): "Two-phase construction of multilayer perceptrons using information theory." IEEE Transactions on Neural Networks, 20(4):715–721.
- Cibas, T., Soulie, F. F., Gallinari, P. & Raudys, S. (1996): "Variable selection with neural networks." Neurocomputing, 12, 223–248.
- Stahlberger, A. & Riedmiller, M. (1997): "Fast network pruning and feature extraction using the unit-OBS algorithm." In M. C. Mozer, M. I. Jordan & T. Petsche (Eds.), Advances in Neural Information Processing Systems 9, pp. 655–661. Cambridge, MA: MIT Press.
- Levin, A. U., Leen, T. K. & Moody, J. E. (1994): "Fast pruning using principal components." In J. D. Cowan, G. Tesauro & J. Alspector (Eds.), Advances in Neural Information Processing Systems 6, pp. 35–42. San Francisco, CA: Morgan Kaufmann.
- Tresp, V., Neuneier, R. & Zimmermann, H. G. (1997): "Early brain damage." In M. Mozer, M. I. Jordan & T. Petsche (Eds.), Advances in Neural Information Processing Systems 9, pp. 669–675. Cambridge, MA: MIT Press.
- Leung, C. S., Wong, K. W., Sum, P. F. & Chan, L. W. (2001): "A pruning method for the recursive least squared algorithm." Neural Networks, 14, 147–174.
- Sum, J., Leung, C. S., Young, G. H. & Kan, W. K. (1999): "On the Kalman filtering method in neural network training and pruning." IEEE Transactions on Neural Networks, 10:161–166.
- Engelbrecht, A. P. (2001): "A new pruning heuristic based on variance analysis of sensitivity information." IEEE Transactions on Neural Networks, 12(6):1386–1399.
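Most of the heuristics above rank parameters or units by an estimate of how much the error would grow if they were removed, then delete the lowest-ranked. The Optimal Brain Damage saliency of Le Cun et al. is the cleanest example: with a diagonal Hessian approximation, s_i = ½ H_ii w_i². A sketch (with H_ii = 1 it degenerates to plain magnitude pruning; names are illustrative):

```python
import numpy as np

def obd_keep_mask(weights, hessian_diag, frac=0.1):
    """Boolean mask keeping all but the `frac` least salient weights.

    Saliency s_i = 0.5 * H_ii * w_i**2 estimates the error increase
    caused by deleting weight i (diagonal-Hessian approximation).
    """
    saliency = 0.5 * hessian_diag * weights ** 2
    return saliency > np.quantile(saliency, frac)  # True = keep this weight
```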
Growing neural networks (the opposite of pruning): constructive networks
- Mezard, M., & Nadal, J. P. (1989): "Learning in feedforward layered networks: The tiling algorithm." Journal of Physics A, 22, 2191–2203.
- Frean, M. (1990): "The upstart algorithm: A method for constructing and training feedforward neural networks." Neural Computation, 2(2):198–209.
- Gallant, S. I. (1990): "Perceptron-based learning algorithms." IEEE Transactions on Neural Networks, 1(2), 179–191.
- Fahlman, S. E., & Lebiere, C. (1990): "The cascade-correlation learning architecture." In D. S. Touretzky (Ed.), Advances in Neural Information Processing Systems 2, pp. 524–532. San Mateo, CA: Morgan Kaufmann.
- Kwok, T. Y., & Yeung, D. Y. (1997): "Objective functions for training new hidden units in constructive neural networks." IEEE Transactions on Neural Networks, 8(5):1131–1148.
- Lehtokangas, M. (1999): "Modelling with constructive backpropagation." Neural Networks, 12, 707–716.
- Moody, J. O., & Antsaklis, P. J. (1996): "The dependence identification neural network construction algorithm." IEEE Transactions on Neural Networks, 7(1):3–13.
- Liu, D., Chang, T. S., & Zhang, Y. (2002): "A constructive algorithm for feedforward neural networks with incremental training." IEEE Transactions on Circuits and Systems I, 49(12):1876–1879.
- Rathbun, T. F., Rogers, S. K., DeSimio, M. P., & Oxley, M. E. (1997): "MLP iterative construction algorithm." Neurocomputing, 17, 195–216.
- Setiono, R., & Hui, L. C. K. (1995): "Use of quasi-Newton method in a feed-forward neural network construction algorithm." IEEE Transactions on Neural Networks, 6(1):273–277.
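All of these constructive algorithms share one loop: add a hidden unit, retrain (often only the new parts), and stop when the error no longer improves. A deliberately simplified sketch with random tanh units plus a least-squares output layer, in place of the trained candidate units of cascade-correlation and its relatives (names and the tolerance are illustrative):

```python
import numpy as np

def grow_network(X, y, max_hidden=50, tol=1e-4, rng=None):
    """Add hidden units one at a time until the error stops improving."""
    rng = rng or np.random.default_rng(0)
    H = np.empty((len(X), 0))                        # hidden activations so far
    best_err = np.inf
    for _ in range(max_hidden):
        w = rng.normal(size=X.shape[1])              # one candidate unit
        H = np.column_stack([H, np.tanh(X @ w)])
        out, *_ = np.linalg.lstsq(H, y, rcond=None)  # refit linear output layer
        err = np.mean((H @ out - y) ** 2)
        if best_err - err < tol:                     # no useful improvement
            H = H[:, :-1]                            # drop the last unit, stop
            break
        best_err = err
    return H.shape[1], best_err                      # final width, final error
```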