Allow Windows to set blank environment variables
ClosedPublic

Authored by habibalamin on Jul 11 2017, 5:06 PM.

Diff Detail

Repository
rGHC Glasgow Haskell Compiler
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
There are a very large number of changes, so older changes are hidden. Show Older Changes
Phyx requested changes to this revision.Jul 17 2017, 12:49 PM

@habibalamin Harbormaster failed to build the patch. Can you take a look and resubmit.

libraries/base/cbits/Win32Utils.c
11 ↗(On Diff #13188)

Currently has a compile error on this include. Why do you need it? I assume it's for FreeEnvironmentStringsW? In which case you can just import Windows.h.

This revision now requires changes to proceed.Jul 17 2017, 12:49 PM
habibalamin edited edge metadata.

Link Shlwapi.lib file with pragma macro.

Phyx requested changes to this revision.Jul 17 2017, 3:46 PM
Phyx added inline comments.
libraries/base/cbits/Win32Utils.c
12 ↗(On Diff #13198)

Don't do this, if you need a library, add it to base's cabal file. This one won't be picked up by GHCi so you'll break loading base in the interpreter.

This revision now requires changes to proceed.Jul 17 2017, 3:46 PM
habibalamin edited edge metadata.
habibalamin marked an inline comment as done.

Add shlwapi to base.cabal extra-libraries.

Phyx retitled this revision from Allow Windows to set blank environment variables (fix #12494). to Allow Windows to set blank environment variables.Jul 18 2017, 2:38 AM
Phyx updated the Trac tickets for this revision.
Phyx edited the test plan for this revision. (Show Details)
Phyx added a comment.Jul 18 2017, 2:43 AM

Almost, while you rebase, could you also edit the revision and add a summary.
The summary will be made the commit message. Keep the summary lines under 80 characters
or the message will be rejected. The commit message will end up looking like this https://github.com/ghc/ghc/commit/f656fba19d0cefe05643ddea35d080ea332a6584

docs/users_guide/8.4.1-notes.rst
153

Look around in the file, there's a syntax we use to refer back to the trac tickets. add Trac #12494 as a reference.

libraries/base/base.cabal
357

could you add shlwapi to the list in the comment above and say what it provides? e.g. A collection of shell and environmental APIs.

Include Windows.h, which defines LPSTR and other typedefs.

Add GHC Trac ticket reference to release notes.

Phyx requested changes to this revision.Jul 18 2017, 2:32 PM

I've been looking into why Harbormaster is failing, when I noticed this:

[These functions are available through Windows XP Service Pack 2 (SP2) and Windows Server 2003. They might be altered or unavailable in subsequent versions of Windows.]
The tables in this document list wrapper functions from Shlwapi.dll that provide limited Unicode functionality to Windows 95, Windows 98, and Windows Millennium Edition (Windows Me).

It seems that these headers are out dated and Microsoft has this dll under the deprecated category. The only real API call here is to GetEnvironmentStringsW and FreeEnvironmentStringsW
which are provided by kernel32. Could you drop the dependency and use just standard portable C99/C11 string functions? That way it's less of a maintenance issue in the future and should build on harbormaster too.

I mean StrCmpNIW is just wcsncmp, etc. This would also allow the compiler to actually optimize the code instead of having to make an external call.

This revision now requires changes to proceed.Jul 18 2017, 2:32 PM

Wow, I don't know how I missed that. Where did you find that, though? Because I'm looking at this page: https://msdn.microsoft.com/en-us/library/windows/desktop/bb759954(v=vs.85).aspx, and I don't see anything mentioned about deprecation. I want to look at the documentation, so I can find the modern equivalents.

libraries/base/System/Environment.hs
196

“adopted that behaviour on POSIX” would be clearest, I think.

libraries/base/cbits/Win32Utils.c
11 ↗(On Diff #13188)

This is because of the StrCmpNIW, StringCchLengthW, and StringCbLengthW functions. I've added a pragma directive to fix it; the only other way is to change the compiler/linker/whatever invocation to make sure the library gets linked.

https://stackoverflow.com/a/5782088/905250

12 ↗(On Diff #13198)

How would I do it? Would I add shlwapi to this line:

extra-libraries: wsock32, user32, shell32, msvcrt, mingw32, mingwex

so it becomes:

extra-libraries: wsock32, user32, shell32, msvcrt, mingw32, mingwex, shlwapi

like that?

What the hell? Looks like some previous comments were being held and had never posted at all. Man, Phabricator is confusing as hell.

habibalamin marked 3 inline comments as done.Jul 18 2017, 2:40 PM
habibalamin added inline comments.
libraries/base/cbits/Win32Utils.c
11 ↗(On Diff #13188)

No, it's not. I'm an idiot who blindly trusts answers to Stack Overflow questions with no detail whatsoever.

Phyx added a comment.Jul 18 2017, 3:16 PM

What the hell? Looks like some previous comments were being held and had never posted at all. Man, Phabricator is confusing as hell.

When you reply to comments, phabricator doesn't post them yet. Much like github's new review mode. So it allows you to write all your comments, and then you have to scroll down the page and press submit.
On the bottom of the page you'll see the comments you have still in draft. This is done so it doesn't email on every comment :)

Phyx added a comment.Jul 18 2017, 3:29 PM

Wow, I don't know how I missed that. Where did you find that, though? Because I'm looking at this page: https://msdn.microsoft.com/en-us/library/windows/desktop/bb759954(v=vs.85).aspx, and I don't see anything mentioned about deprecation. I want to look at the documentation, so I can find the modern equivalents.

I was looking at https://msdn.microsoft.com/en-us/library/windows/desktop/bb759845(v=vs.85).aspx but a closer look seems to indicate that only the list of functions on that page is deprecated? Looking at the header I don't fully understand why harbormaster is failing. I assume you're testing on x86_64 Windows as well locally?

I do wish to avoid using these string compare functions if possible though. If you switched to https://msdn.microsoft.com/en-us/library/eywx8zcx.aspx these it should all work fine without an external dependency. And because GCC knows these, it can also do things like inline them etc. It also makes it easier to understand the code, I had to literally look up all these functions.

habibalamin marked an inline comment as done.Jul 18 2017, 3:39 PM
In D3726#106255, @Phyx wrote:

I was looking at https://msdn.microsoft.com/en-us/library/windows/desktop/bb759845(v=vs.85).aspx but a closer look seems to indicate that only the list of functions on that page is deprecated? Looking at the header I don't fully understand why harbormaster is failing. I assume you're testing on x86_64 Windows as well locally?

I was testing these on a local 20GB VM, but I got rid of it since several diffs ago and just started using Harbormaster to test it. I told bgamari on IRC, but I should have commented here, too. I just don't have the space on my 256GB drive to be supporting a 20GB VM for that long.

I do wish to avoid using these string compare functions if possible though. If you switched to https://msdn.microsoft.com/en-us/library/eywx8zcx.aspx these it should all work fine without an external dependency. And because GCC knows these, it can also do things like inline them etc. It also makes it easier to understand the code, I had to literally look up all these functions.

Sure, I'll use those functions instead. The only problem is, it might fail to build because of incompatible pointer types. Is const wchar_t * equivalent to LPCSTR or wchar_t * equivalent to LPSTR (which is what the environment functions take and return)? I'll have to check all the pointer types are compatible (scratch that, looks like they are: https://msdn.microsoft.com/en-us/library/windows/desktop/dd374131(v=vs.85).aspx ).

I'll probably get on the latest requested changes later today or tomorrow (~10pm here). However, I think we're close.

habibalamin edited edge metadata.

Use non-deprecated string functions.

Fix missing semicolon.

Phyx added a comment.Jul 19 2017, 9:10 AM

Sure, I'll use those functions instead. The only problem is, it might fail to build because of incompatible pointer types. Is const wchar_t * equivalent to LPCSTR or wchar_t * equivalent to LPSTR (which is what the environment functions take and return)? I'll have to check all the pointer types are compatible (scratch that, looks like they are: https://msdn.microsoft.com/en-us/library/windows/desktop/dd374131(v=vs.85).aspx ).

Yup, they are for our case as the build system also specifies -DUNICODE.

Initialize value to NULL at start to fix -Werror=maybe-uninitialized failure.

Damn. I was having this failure (not recognising WINDOWS_CCONV) on my local Windows machine. I still don't know why this happens, but it was initially a local-only problem (masking other errors that Harbormaster showed). I'm gonna try a rebuild, and I'm gonna set up my Windows VM again to debug this tomorrow.

Sorry, haven't had a chance to look at this yet. I'll have a couple of hours to look at it tonight after work

"habibalamin (حبيب الامين)" <noreply@phabricator.haskell.org> writes:

No worries; thanks for your work!

I need to head off to work, but I'll leave you with a screenshot of where I'm at. The error in the screenshot is a trivial one, easy to fix; just a basic type mismatch. I'll have to fix it when I get home, as I don't want to start a build right now.

I got the same errors on my Windows VM as I was getting on Harbormaster. Unfortunately, I just fixed it by manually using the if defined macros for each FFI function I was creating to decide on ccall or stdcall, instead of defining WINDOWS_CCONV (hopefully, windows_cconv.h will fix this). That seems to work and not the current macro block, despite so many other files using the same macro block without any issue. I didn't have any issues with that file initially, either (and I don't think it was because of other errors masking that one).

Phyx added a comment.Jul 28 2017, 2:47 AM

No problem, no rush :)

I got the same errors on my Windows VM as I was getting on Harbormaster. Unfortunately, I just fixed it by manually using the if defined macros for each FFI function I was creating to decide on ccall or stdcall, instead of defining WINDOWS_CCONV (hopefully, windows_cconv.h will fix this). That seems to work and not the current macro block, despite so many other files using the same macro block without any issue.

I assume you're referring to this error:

libraries\base\System\Environment\Blank.hsc:91:16: error:
223	    parse error on input ‘WINDOWS_CCONV’
224	   |
225	91 | foreign import WINDOWS_CCONV unsafe "getenv_windows"
226	   |                ^^^^^^^^^^^^^

This error happens because you're in an hsc file. hsc2hs as you know also uses
# as a preprocessor marker. The problem is however it doesn't recognise some CPP blocks such as #defines.
So when it sees your block it just processes it, doesn't recognise any of the blocks and strips them.

Instead you need to escape the CPP bits, so that once hsc2hs is done generating the hs file, CPP can still
process them correctly. To do that just add another # as hsc2hs will just strip away one level.

##if defined(mingw32_HOST_OS)
## if defined(i386_HOST_ARCH)
##  define WINDOWS_CCONV stdcall
## elif defined(x86_64_HOST_ARCH)
##  define WINDOWS_CCONV ccall
## else
##  error Unknown mingw32 arch
## endif
##endif

Is how you would typically do it in an hsc file.

Phyx added a comment.Jul 28 2017, 2:49 AM

Also please don't forget to update the summary of the ticket, as that becomes part of the commit message.
And keep the line width under 80 characters, or it will be rejected by the commit hooks.

Thanks for all the work!

In D3726#107177, @Phyx wrote:

This error happens because you're in an hsc file.

I knew it, I frigging knew it. I noted to myself at one point that the error had started occurring when I changed the file to .hsc (though I wasn't certain). At first, I suspected the CApiFFI language extension, then I suspected the file extension change. However, when I looked at libraries/base/System/Environment/ExecutablePath.hsc for comparison (I think that's where I got the block from), I didn't notice the difference. I must have absentmindedly ‘fixed’ the double hashes when I copied it in, thinking, “this is not how C macros work”.

hsc2hs as you know also uses # as a preprocessor marker. The problem is however it doesn't recognise some CPP blocks such as #defines.

I did not know either of those things. Thank you so much.

Instead you need to escape the CPP bits, so that once hsc2hs is done generating the hs file, CPP can still
process them correctly. To do that just add another # as hsc2hs will just strip away one level.

##if defined(mingw32_HOST_OS)
## if defined(i386_HOST_ARCH)
##  define WINDOWS_CCONV stdcall
## elif defined(x86_64_HOST_ARCH)
##  define WINDOWS_CCONV ccall
## else
##  error Unknown mingw32 arch
## endif
##endif

Is how you would typically do it in an hsc file.

Looks like ExecutablePath.hsc gets away with only doubling the new defines, not the if macros. I'll try that first.

In D3726#107180, @Phyx wrote:

Also please don't forget to update the summary of the ticket

You mean the Trac ticket?

Today I learned. Good catch, @Phyx!

Phyx added a comment.Jul 29 2017, 1:42 AM
In D3726#107180, @Phyx wrote:

Also please don't forget to update the summary of the ticket

You mean the Trac ticket?

No sorry, I meant this differential. Go to the top and on the right click edit differential
and you'll find a summary field.

Phyx requested changes to this revision.Jul 29 2017, 1:42 AM

Bumping out of review queue pending fixes.

This revision now requires changes to proceed.Jul 29 2017, 1:42 AM
habibalamin edited edge metadata.

I'm having a lot of trouble getting getenv_windows to work in C; I even
tried porting Matz' Ruby Implementation code and I just can't get it to
work. Memory access errors every time.

It works in GHCi with modified setEnv: lookup key <$> getEnvironment.
Nice, easy, simple.

Phyx added a comment.Jul 29 2017, 2:26 PM

the C code is the one in this diff?

Remove redundant imports from non-Windows OSes.

@Phyx yes, it's the one that was in this diff. I've removed it, I'm giving up. Sorry, I should have made it clear in the diff update message that I was moving on. I've just decided to use the Haskell implementation. @bgamari and I had a conversation about this and leant towards implementing in C for performance reasons (me assuming it would be less difficult than it's proven); in C, GetEnvironmentStrings returns a pointer to the environment variable block, and we can copy the relevant value, free the block and just marshal the single value, whereas in Haskell, we'd have to marshal the whole block of variables through to Haskell-land just to lookup one variable.

Anyway, I've converted it to Haskell (lookup <$> getEnvironment) and I'm moving on. I've got too many other things on the go that are being stalled because of this.

@Phyx won't the title form part of the commit message also? Is a summary necessary if the title is self-explanatory (this ticket has no summary currently)?

habibalamin updated this revision to Diff 13391.EditedJul 29 2017, 3:04 PM

Wrap lines longer than 80 columns.

If anyone does pick up the task of writing it in C, this might prove useful - http://utf8everywhere.org. I was under the mistaken assumption that the Windows API's Unicode functions were the ‘right’ ones to use for dealing with Unicode strings in C on Windows.

habibalamin marked 5 inline comments as done.Jul 29 2017, 4:31 PM
Phyx added a comment.Jul 30 2017, 2:25 AM

If anyone does pick up the task of writing it in C, this might prove useful - http://utf8everywhere.org. I was under the mistaken assumption that the Windows API's Unicode functions were the ‘right’ ones to use for dealing with Unicode strings in C on Windows.

I don't understand, Windows API Unicode functions are UTF-16 always and always take a widechar. The W in the name is for Wide after all. Yes UTF-16 and wchar_t are not the same thing, but that doesn't change the fact that the Wide APIs on Windows accept only UTF-16 https://msdn.microsoft.com/en-us/library/windows/desktop/dd374081(v=vs.85).aspx and the article doesn't contradict this.

In any case, If it works in Haskell, that's fine, I'll just wait for the build bots to catch up. Thanks!

Remove redundant import for Windows.

In D3726#107429, @Phyx wrote:

I don't understand, Windows API Unicode functions are UTF-16 always and always take a widechar. The W in the name is for Wide after all. Yes UTF-16 and wchar_t are not the same thing, but that doesn't change the fact that the Wide APIs on Windows accept only UTF-16 https://msdn.microsoft.com/en-us/library/windows/desktop/dd374081(v=vs.85).aspx and the article doesn't contradict this.

I was not aware of the details of Unicode on Windows. I wanted this to work with Unicode same as the other environment functions, and the Windows documentation says that the W suffix on functions refers to Unicode versions of those functions. I did not know it was UTF-16 and not UTF-8.

By the way, you mentioned updating the ticket summary; is this required, or is it okay if the title is sufficient? How does the ticket title fit in to the final commit?

Phyx accepted this revision.Jul 30 2017, 5:45 PM

By the way, you mentioned updating the ticket summary; is this required, or is it okay if the title is sufficient? How does the ticket title fit in to the final commit?

It's not required, typically I would uses a. Very short title to say what it does and the summary for they why. Such that someone just browsing git log has basic context the change
Without having to visit the trac ticket. But everyone has different styles.

The title should be fine, thanks for sticking with it! I'll push it in a few hours.

This revision was automatically updated to reflect the committed changes.

Can we add this to the 8.4.1 milestone?

Can we add this to the 8.4.1 milestone?

The ticket is already milestoned for 8.4.1. Is that what you mean?

Yes, looks like Phyx just set it ~4 hours ago.