Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

System.Text.RegularExpressions.Generator doesn't support multi-targeted projects, even if GeneratedRegexAttribute is shimmed #104212

Open
madelson opened this issue Jun 30, 2024 · 11 comments

Comments

@madelson
Copy link
Contributor

Description

In multitargeted projects (e.g. netstandard2.0 + .net8.0), the regex analyzer often suggests a refactor to use GeneratedRegex.

However, GeneratedRegex isn't compatible with older frameworks, both because the attribute does not exist and because the source generator doesn't run on the older frameworks.

It would be nice if the generator would simply emit a cached field for frameworks that don't support full source generation, just like it does today for older language versions. The user would still have to shim the attribute themselves, but this is easy to do, could be done by the codefix, or could probably even be part of the generator itself.

This would allow multi-targeted libraries to take advantage of generated regex performance for the builds that target newer frameworks while still supporting netstandard2.0.

Reproduction Steps

Try to use GeneratedRegex in a multi-targeted project

Expected behavior

It would be nice if this "just worked", but at least it should stop suggesting a codefix which breaks the build.

Actual behavior

Generator does not run

Regression?

No

Known Workarounds

I can manually fork the generation with #if:

#if NET
    [StringSyntax(StringSyntaxAttribute.Regex)]
#endif
    private const string MyRegexPattern = @"a.*b";

#if !NET
    private static Regex? MyRegexCache;
    private static Regex MyRegex() => MyRegexCache ??= new(MyRegexPattern);
#else
    [GeneratedRegex(MyRegex)]
    private static partial Regex MyRegex();
#endif

Configuration

.NET 8

Other information

n/a

@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Jun 30, 2024
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.

@stephentoub
Copy link
Member

stephentoub commented Jun 30, 2024

It would be nice if the generator would simply emit a cached field for frameworks that don't support full source generation

It's part of the ref pack for the target framework. It's not that it doesn't emit code downlevel, it's that it doesn't even exist downlevel.

@stephentoub
Copy link
Member

stephentoub commented Jun 30, 2024

It would be nice if this "just worked", but at least it should stop suggesting a codefix which breaks the build.

This is the case for all analyzers in .NET that look for a particular newer API and suggest upgrading to it, not just the regex analyzer.

@madelson
Copy link
Contributor Author

It's part of the ref pack for the target framework. It's not that it doesn't emit code downlevel, it's that it doesn't even exist downlevel.

I can see that, but it seems like the System.Text.Json generator does exist in older frameworks, so I wondered if it were possible for the same to be true with regex generation.

This is the case for all analyzers in .NET that look for a particular newer API and suggest upgrading to it, not just the regex analyzer.

Agreed. I wish this weren't the case and it seems like it should be solvable, but that would be a very different project.

@stephentoub
Copy link
Member

but it seems like the System.Text.Json generator does exist in older frameworks

STJ ships as a nuget package for downlevel use and never shipped in-box on .NET Framework.

@RenderMichael
Copy link
Contributor

I agree with the sentiment, it would be very nice if the Regex generator could ship for standard 2.x, even if it does nothing but make a normal cached regex instance. It would remove the complexity/ optimization tradeoff here.

@julealgon
Copy link

It would be nice if the generator would simply emit a cached field for frameworks that don't support full source generation

It's part of the ref pack for the target framework. It's not that it doesn't emit code downlevel, it's that it doesn't even exist downlevel.

@stephentoub I ran into this scenario last week when trying to see if I could get Regex generation on our older project to work, and saw that the dll where the generator is implemented is not exposed in NuGet but only as part of the app.ref packages.

Is that purely a maintenance-related matter on your side, in the sense that you don't need to worry about exposing a lot of some of these "base" libraries as a dedicated NuGet, or is there more to it?

I assume that's why the decision was made but figured I'd ask to clarify.

@stephentoub
Copy link
Member

Is that purely a maintenance-related matter on your side, in the sense that you don't need to worry about exposing a lot of some of these "base" libraries as a dedicated NuGet, or is there more to it?

It's related but more about the complexity of the source generator. Today, the source generator is part of a .NET version, effectively providing the implementation of APIs in that release. That means it needn't be concerned about things like using APIs that aren't available, having different code paths to follow based on which APIs exist and which don't. It means it can be kept in sync with RegexOptions.Compiled, as the code for each is effectively 1:1. It means it doesn't need to be concerned about being used on newer or older versions, just the one it's a part of. Etc. Changing that adds significant complexity, both in implementation, in deployment management, in support, etc. We've accepted that cost when we've effectively had to, e.g. when the library with which the generator is associated itself ships out-of-band in a nuget package, but we've avoided doing so otherwise. We do not currently intend to do otherwise for regex.

@steveharter
Copy link
Member

Moving to future; please close if the discussion leads that way (i.e. not adding an additional target for netstandard).

@steveharter steveharter added this to the Future milestone Jul 3, 2024
@steveharter steveharter removed the untriaged New issue has not been triaged by the area owner label Jul 3, 2024
@madelson
Copy link
Contributor Author

madelson commented Jul 3, 2024

WRT maintenance, I feel like this could be done in a way that wouldn’t touch the core generator code at all.

Imagine there were a new package targeting netstandard that did 2 things:

  1. shimmed the GeneratedRegexAttribute in the same way as packages like IsExternalInit
  2. Provided a source generator that would generate the cached field behind a framework #if that ensure it won’t light up on any framework that supports real regex generation.

This only code this package would want to potentially share with the main dotnet runtime source is the attribute definition and the source generator logic for locating the attribute usages, but frankly both of those are optional.

@RenderMichael
Copy link
Contributor

I think the main concern here with targeting netstandard is that eg. .NET 7 would be a supported platform, but which is incompatible with the source gen (eg. SearchValues is unavailable). The generator code is only meant to work with the latest source gen.

I have a more unconventional suggestion. What if a nuget package is released which targets every platform in netstandard 2.0 that doesn’t support GeneratedRegex (that’s .NET 6 and below)? That way, .NET 7 is not a supported target, but it’s not an issue because support is there out of the box.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants