A general theory on rates of convergence of the least-squares projection estimate in multiple regression is developed. The theory is applied to the functional ANOVA model, where the multivariate regression function is modeled as a specified sum of a constant term, main effects (functions of one variable) and selected interaction terms (functions of two or more variables). The least-squares projection is onto an approximating space constructed from arbitrary linear spaces of functions and their tensor products respecting the assumed ANOVA structure of the regression function. The linear spaces that serve as building blocks can be any of the ones commonly used in practice: polynomials, trigonometric polynomials, splines, wavelets and finite elements. The rate of convergence result that is obtained reinforces the intuition that low-order ANOVA modeling can achieve dimension reduction and thus overcome the curse of dimensionality. Moreover, the components of the projection estimate in an appropriately defined ANOVA decomposition provide consistent estimates of the corresponding components of the regression function. When the regression function does not satisfy the assumed ANOVA form, the projection estimate converges to its best approximation of that form.