I find the entire one-letter “zoo” of effect size measures difficult to understand.
I started with Cohen’s effect size measures, especially d and f (or f-squared), when I added power and sample size computation.
The main question is whether the effect size measure should follow a basic definition, or whether it should be the measure appropriate for power computation for a specific hypothesis test.
For statsmodels I started to move away from using the basic Cohen’s effect size measures for power computation and instead use normalized (divided by nobs) noncentrality parameters.
Example: variance heterogeneity in the t-test or ANOVA.
Cohen’s d and f effect sizes assume homogeneous variance across groups or samples. This means they are not fully appropriate for the Welch t-test or for ANOVA when variances are allowed to differ.
For example, in statsmodels.stats.oneway.effectsize_oneway I allow for an unequal-variance option, so the effect size measure can be used for power computation for the Welch and Brown-Forsythe ANOVAs of means; see the Notes section of the docstring. A rough sketch of the underlying idea is below.
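As an illustration of what “normalized noncentrality” means here, the following is a minimal sketch in plain numpy/scipy, not the statsmodels implementation. The group means, variances, and sizes are made up, and the denominator degrees of freedom in the power step is a crude plug-in (Welch’s test actually uses a Satterthwaite-type df):

```python
import numpy as np
from scipy import stats

means = np.array([0.0, 0.3, 0.5])       # hypothesized group means (made up)
variances = np.array([1.0, 1.5, 4.0])   # heterogeneous group variances (made up)
nobs = np.array([30, 30, 40])           # group sizes (made up)
n_total = nobs.sum()
k = len(means)

# Cohen's f-squared: variance of the means relative to a single pooled
# within-group variance -- this is where homogeneity is assumed.
pooled_var = np.average(variances, weights=nobs)
grand_mean = np.average(means, weights=nobs)
f2_cohen = np.average((means - grand_mean) ** 2, weights=nobs) / pooled_var

# Welch-type alternative: weight each group by n_i / var_i instead of pooling,
# take the resulting noncentrality and divide by the total number of
# observations to get a sample-size-free "effect size".
w = nobs / variances
mean_w = np.average(means, weights=w)
ncp = np.sum(w * (means - mean_w) ** 2)
f2_welch = ncp / n_total

print(f2_cohen, f2_welch)   # the two measures differ once variances differ

# Approximate power from the noncentral F distribution.  The denominator df
# here is a crude plug-in; Welch's test uses a Satterthwaite-type df.
df1, df2 = k - 1, n_total - k
crit = stats.f.isf(0.05, df1, df2)
power = stats.ncf.sf(crit, df1, df2, f2_welch * n_total)
print(power)
```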
About Cohen’s d
I was initially confused for some time because Cohen’s d divides by the (pooled) population standard deviation even in the 2-sample case. In that case it is not simply noncentrality / nobs; the noncentrality of the two-sample t statistic also depends on how the observations are split across the two groups.
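To make the mismatch concrete, here is a minimal sketch with made-up numbers:

```python
import numpy as np

# Made-up numbers, just to show the relationship.
m1, m2 = 0.5, 0.0
sd = 1.0            # common (pooled) standard deviation assumed by Cohen's d
n1, n2 = 40, 60
nobs = n1 + n2

d = (m1 - m2) / sd                                   # Cohen's d
ncp = (m1 - m2) / (sd * np.sqrt(1 / n1 + 1 / n2))    # noncentrality of the 2-sample t
# equivalently: ncp = d * np.sqrt(n1 * n2 / (n1 + n2))

print(d**2, ncp**2 / nobs)
# The two differ by the factor n1 * n2 / (n1 + n2)**2 (1/4 for balanced groups),
# so mapping d to the noncentrality needs the group allocation, not just nobs.
```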
Related: statsmodels does not yet have eta (or a similar measure) for regression models or for ANOVA after linear regression. The main reason is that I could not decide which definition to use.
For power computation in regression models it is much simpler to use the normalized noncentrality for the specific hypothesis test.
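To make that concrete, here is a minimal sketch (made-up design matrix, coefficients, and error variance; plain numpy/scipy, not a statsmodels API) of a normalized noncentrality for a specific Wald/F test in a linear regression, and the power that follows from it:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
nobs, k = 200, 4
# Made-up design matrix, coefficients, and error variance.
X = np.column_stack([np.ones(nobs), rng.normal(size=(nobs, k - 1))])
beta = np.array([1.0, 0.2, 0.0, 0.3])
sigma2 = 1.5

# Wald/F test of the joint hypothesis that the last two slopes are zero.
R = np.zeros((2, k))
R[0, 2] = 1.0
R[1, 3] = 1.0
r = R @ beta

# Noncentrality of the F statistic under this alternative, then divide by
# nobs so the measure does not grow with the sample size.
cov_unscaled = np.linalg.inv(X.T @ X)
ncp = r @ np.linalg.solve(sigma2 * (R @ cov_unscaled @ R.T), r)
f2 = ncp / nobs

# Power of the F test at this nobs from the noncentral F distribution.
df1, df2 = R.shape[0], nobs - k
crit = stats.f.isf(0.05, df1, df2)
power = stats.ncf.sf(crit, df1, df2, f2 * nobs)
print(f2, power)
```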
I have many stalled issues for effect sizes in statsmodels. So, if scipy gets the basic effect size measures, then I would have to worry less about those and could keep focusing on what’s appropriate for power and sample size computation.
Example: “Measures of effect size”, statsmodels/statsmodels issue #5896 on GitHub.
The main point for me is that the standard effect size measures rely on very strong assumptions, like variance homogeneity or balanced samples in multiway ANOVA. In statsmodels I try to emphasize methods that are robust to violations of these assumptions. Using the standard effect size measures can be misleading if we use hypothesis tests that do not rely on these assumptions or are robust to some misspecification.