stackage-server/src/Stackage/Database/Github.hs
Alexey Kuleshevich f5e147ab97
Integration with Pantry and usage of new stackage-snapshots:
* Moved all extensions into modules that are using them, rather than globally,
  since they mess up ghci session and introduce conflicts among
  packages. Removed those from `.ghci` file as well
* Redesigned the schema to use Pantry and moved it into it's own module
* Switched all of the db and cron related stuff to RIO. Yesod part is
  still on classy-prelude
* Got pantry to update stackage-server database from hackage
* Got import of stackage-snapshots implemented
* Moved some logic from all-cabal-tool
* Switched everything to `PackageNameP`, `VersionP`, etc. from a la Text.
* Fixed haddock, so it now does proper redirects and pipes the docs
  correctly. Also implemented piping of json files from S3 bucket,
  so index-doc.json is also served by stackage-server thus making
  Ctrl+S feature work properly on haddock. Fix for commercialhaskell/stackage#4301
* Import of modules is done through cabal file parsing, which slows
  down the initial import process drastically, but incremental update
  is not a problem.
* Just as with modules, dependencies are also imported from cabal file.
* In general improved type safety by introducing a few data types:
  eg. `ModuleNameP`, `HackageCabalInfo`, and many more.
* Implemented pulling of deprecation map from hackages and storing it in db
* Implementation of forward/backward dependencies within a snapshot only.
* Drastically improved performance of cron import job, by checking which
  snapshots are not up to date
* Implemented pulling haddock list from S3 bucket. Modules that have
  documentation are marked from the availability of actual haddock. This
  process happens concurrently with snapshots loading.
* Rearranged modules a bit:
  * github related functions went into it's own module
  * cron related functions where moved from Database to Cron module
  * Split up some functions to reduce individual complexity
* Parallelized package loading in cron job
* Implemented parsed cabal file caching.
* All queries where reqritten with esqueleto
* Syntactic improvements:
  * Added stylish-haskell config
  * Formatted all imports and extensions with stylish-haskell.
  * Fixed inconsistent indentation across all modules
* Many improvements to the package page as well as few others.
* Reimplemented hoogledb creation.
* Dropped dependency on tar in favor of tar-conduit
* Added cli for stackage-server-cron
* Add cabal sha and size to the package page
* Fixed links in hoogle searches. Improved type safety for a hoogle handler
* satckage-server-cron is customizable with cli arguments

Final adjustments for the new stackage server release:

* Upgrade to lts-13.16.
* Stackage server related code has been merged to pantry. Made the code
  compatible with the newer version pantry
* Added cli '--snapshots-repo'
* Add readme to package page
* Adjust snapshots expected format:
  * Added `publish-time`
  * Removed name `field`
  * `compiler` field is now in the `resolver` field with fallback to
    the root
2019-04-30 17:10:33 +03:00

75 lines
2.6 KiB
Haskell

{-# LANGUAGE NamedFieldPuns #-}
{-# LANGUAGE NoImplicitPrelude #-}
module Stackage.Database.Github
( cloneOrUpdate
, lastGitFileUpdate
, getStackageContentDir
, GithubRepo(..)
) where
import qualified Data.ByteString.Lazy.Char8 as LBS8
import RIO
import RIO.Directory
import RIO.FilePath
import RIO.Process
import RIO.Time
data GithubRepo = GithubRepo
{ grAccount :: !String
, grName :: !String
} deriving Show
gitLog
:: (MonadReader env m, HasLogFunc env, HasProcessContext env,
MonadIO m) =>
FilePath -> String -> [String] -> m LBS8.ByteString
gitLog gitDir filePath args =
withWorkingDir gitDir $ proc "git" ("log" : (args ++ [filePath])) readProcessStdout_
-- | From the git commit log infer the timestamp when the file was changed last .
lastGitFileUpdate ::
(MonadReader env m, HasLogFunc env, HasProcessContext env, MonadUnliftIO m)
=> FilePath -- ^ Root dir of the repository
-> FilePath -- ^ Relative path of the file
-> m (Either String UTCTime)
lastGitFileUpdate gitDir filePath = do
lastCommitTimestamps <- gitLog gitDir filePath ["-1", "--format=%cD"]
parseGitDate rfc822DateFormat lastCommitTimestamps
where
parseGitDate fmt dates =
case listToMaybe $ LBS8.lines dates of
Nothing -> return $ Left "Git log is empty for the file"
Just lbsDate ->
mapLeft (displayException :: SomeException -> String) <$>
try (parseTimeM False defaultTimeLocale fmt (LBS8.unpack lbsDate))
-- | Clone a repository locally. In case when repository is already present sync it up with
-- remote. Returns the full path where repository was cloned into.
cloneOrUpdate ::
(MonadReader env m, HasLogFunc env, HasProcessContext env, MonadIO m)
=> FilePath -- ^ Path where the repo should be cloned
-> GithubRepo -- ^ Github user or organization name together with repository name
-> m FilePath
cloneOrUpdate root GithubRepo {grAccount, grName} = do
exists <- doesDirectoryExist dest
if exists
then withWorkingDir dest $ do
proc "git" ["fetch"] runProcess_
proc "git" ["reset", "--hard", "origin/master"] runProcess_
else withWorkingDir root $
proc "git" ["clone", url, grName] runProcess_
return dest
where
url = "https://github.com/" <> grAccount <> "/" <> grName <> ".git"
dest = root </> grName
getStackageContentDir ::
(MonadReader env m, HasLogFunc env, HasProcessContext env, MonadIO m)
=> FilePath
-> m FilePath
getStackageContentDir rootDir =
cloneOrUpdate rootDir (GithubRepo "commercialhaskell" "stackage-content")